NUCLEIC ACID SEQUENCES TO PROTEINS INVOLVED IN

TOCOPHEROL SYNTHESIS



INTRODUCTION 

TECHNICAL FIELD 

The present invention is directed to nucleic acid and amino acid sequences and
constructs, and methods related thereto.
BACKGROUND 

Isoprenoids are ubiquitous compounds found in all living organisms. Plants
synthesize a diverse array of greater than 22,000 isoprenoids (Connolly and Hill
(1992) Dictionary of Terpenoids, Chapman and Hall, New York, NY). In plants,
isoprenoids play essential roles in particular cell functions such as production of
sterols, contributing to eukaryotic membrane architecture, acyclic polyprenoids found
in the side chain of ubiquinone and plastoquinone, growth regulators like abscisic
acid, gibberellins, brassinosteroids or the photosynthetic pigments chlorophylls and
carotenoids. Although the physiological role of other plant isoprenoids is less evident,
like that of the vast array of secondary metabolites, some are known to play key roles
mediating the adaptive responses to different environmental challenges. In spite of
the remarkable diversity of structure and function, all isoprenoids originate from a
single metabolic precursor, isopentenyl diphosphate (IPP) (Wright, (1961) Annu. Rev.
Biochem. 20:525-548; and Spurgeon and Porter, (1981) in Biosynthesis of Isoprenoid
Compounds, Porter and Spurgeon eds (John Wiley, New York) Vol. 1, pp. 1-46).

A number of unique and interconnected biochemical pathways derived from
the isoprenoid pathway leading to secondary metabolites, including tocopherols, exist
in chloroplasts of higher plants. Tocopherols not only perform vital functions in
plants, but are also important from mammalian nutritional perspectives. In plastids,
tocopherols account for up to 40% of the total quinone pool.

Tocopherols and tocotrienols (unsaturated tocopherol derivatives) are well
known antioxidants, and play an important role in protecting cells from free radical
damage, and in the prevention of many diseases, including cardiac disease, cancer,
cataracts, retinopathy, Alzheimer's disease, and neurodegeneration, and have been
shown to have beneficial effects on symptoms of arthritis, and in anti-aging. Vitamin
E is used in chicken feed for improving the shelf life, appearance, flavor, and
oxidative stability of meat, and to transfer tocols from feed to eggs. Vitamin E has
been shown to be essential for normal reproduction, to improve overall performance,
and to enhance immunocompetence in livestock animals. Vitamin E supplement in
animal feed also imparts oxidative stability to milk products.

The demand for natural tocopherols as supplements has been steadily growing
at a rate of 10-20% for the past three years. At present, the demand exceeds the
supply for natural tocopherols, which are known to be more biopotent than racemic
mixtures of synthetically produced tocopherols. Naturally occurring tocopherols are
all d-stereomers, whereas synthetic α-tocopherol is a mixture of eight d,l-α-tocopherol
isomers, only one of which (12.5%) is identical to the natural d-α-tocopherol.
Natural d-α-tocopherol has the highest vitamin E activity (1.49 IU/mg)
when compared to other natural tocopherols or tocotrienols. The synthetic
α-tocopherol has a vitamin E activity of 1.1 IU/mg. In 1995, the worldwide market for
raw refined tocopherols was $1020 million; synthetic materials comprised 85-88% of
the market, the remaining 12-15% being natural materials. The best sources of natural
tocopherols and tocotrienols are vegetable oils and grain products. Currently, most of
the natural Vitamin E is produced from γ-tocopherol derived from soy oil processing,
which is subsequently converted to α-tocopherol by chemical modification
(α-tocopherol exhibits the greatest biological activity).

Methods of enhancing the levels of tocopherols and tocotrienols in plants,
especially levels of the more desirable compounds that can be used directly, without
chemical modification, would be useful to the art as such molecules exhibit better
functionality and bioavailability.

In addition, methods for the increased production of other isoprenoid-derived
compounds in a host plant cell are desirable. Furthermore, methods for the production of
particular isoprenoid compounds in a host plant cell are also needed.
SUMMARY OF THE INVENTION 

The present invention is directed to sequences to proteins involved in 
tocopherol synthesis. The polynucleotides and polypeptides of the present invention 
include those derived from prokaryotic and eukaryotic sources. 
Thus, one aspect of the present invention relates to prenyltransferase, and in
particular to isolated polynucleotide sequences encoding prenyltransferase proteins
and polypeptides related thereto. In particular, isolated nucleic acid sequences
encoding prenyltransferase proteins from bacterial and plant sources are provided.

In another aspect, the present invention provides isolated polynucleotide
sequences encoding tocopherol cyclase, and polypeptides related thereto. In
particular, isolated nucleic acid sequences encoding tocopherol cyclase proteins from
bacterial and plant sources are provided.

Another aspect of the present invention relates to oligonucleotides which
include partial or complete prenyltransferase or tocopherol cyclase encoding
sequences.

It is also an aspect of the present invention to provide recombinant DNA
constructs which can be used for transcription or transcription and translation
(expression) of prenyltransferase or tocopherol cyclase. In particular, constructs are
provided which are capable of transcription or transcription and translation in host
cells.

In another aspect of the present invention, methods are provided for
production of prenyltransferase or tocopherol cyclase in a host cell or progeny
thereof. In particular, host cells are transformed or transfected with a DNA construct
which can be used for transcription or transcription and translation of
prenyltransferase or tocopherol cyclase. The recombinant cells which contain
prenyltransferase or tocopherol cyclase are also part of the present invention.

In a further aspect, the present invention relates to methods of using
polynucleotide and polypeptide sequences to modify the tocopherol content of host
cells, particularly in host plant cells. Plant cells having such a modified tocopherol
content are also contemplated herein. Methods and cells in which both
prenyltransferase and tocopherol cyclase are expressed in a host cell are also part of
the present invention.



The modified plants, seeds and oils obtained by the expression of the
prenyltransferase or tocopherol cyclase are also considered part of the invention.
BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 provides an amino acid sequence alignment of ATPT2, ATPT3,
ATPT4, ATPT8, and ATPT12, performed using ClustalW.

Figure 2 provides a schematic picture of the expression construct pCGN10800.

Figure 3 provides a schematic picture of the expression construct pCGN10801.

Figure 4 provides a schematic picture of the expression construct pCGN10803.

Figure 5 provides a schematic picture of the construct pCGN10806.

Figure 6 provides a schematic picture of the construct pCGN10807.

Figure 7 provides a schematic picture of the construct pCGN10808.

Figure 8 provides a schematic picture of the expression construct pCGN10809.

Figure 9 provides a schematic picture of the expression construct pCGN10810.

Figure 10 provides a schematic picture of the expression construct pCGN10811.

Figure 11 provides a schematic picture of the expression construct pCGN10812.

Figure 12 provides a schematic picture of the expression construct pCGN10813.

Figure 13 provides a schematic picture of the expression construct pCGN10814.

Figure 14 provides a schematic picture of the expression construct pCGN10815.

Figure 15 provides a schematic picture of the expression construct pCGN10816.

Figure 16 provides a schematic picture of the expression construct pCGN10817.

Figure 17 provides a schematic picture of the expression construct pCGN10819.

Figure 18 provides a schematic picture of the expression construct pCGN10824.

Figure 19 provides a schematic picture of the expression construct pCGN10825.

Figure 20 provides a schematic picture of the expression construct pCGN10826.

Figure 21 provides an amino acid sequence alignment using ClustalW between the
Synechocystis prenyltransferase sequences.

Figure 22 provides an amino acid sequence of the ATPT2, ATPT3, ATPT4,
ATPT8, and ATPT12 protein sequences from Arabidopsis and the slr1736, slr0926,
sll1899, slr0056, and slr1518 amino acid sequences from Synechocystis.

Figure 23 provides the results of the enzymatic assay from preparations of
wild type Synechocystis strain 6803, and Synechocystis slr1736 knockout.

Figure 24 provides bar graphs of HPLC data obtained from seed extracts of
transgenic Arabidopsis containing pCGN10822, which provides for the expression of
the ATPT2 sequence, in the sense orientation, from the napin promoter. Provided are
graphs for alpha, gamma, and delta tocopherols, as well as total tocopherol for 22
transformed lines, as well as a nontransformed (wildtype) control.

Figure 25 provides a bar graph of HPLC analysis of seed extracts from
Arabidopsis plants transformed with pCGN10803 (35S-ATPT2, in the antisense
orientation), pCGN10822 (line 1625, napin ATPT2 in the sense orientation),
pCGN10809 (line 1627, 35S-ATPT3 in the sense orientation), a nontransformed (wt)
control, and an empty vector transformed control.

Figure 26 shows total tocopherol levels measured in T# Arabidopsis seed of
line.

Figure 27 shows total tocopherol levels measured in T# Arabidopsis seed of
line.

Figure 28 shows total tocopherol levels measured in developing canola seed of
line 10822-1.

Figure 29 shows results of phytyl prenyltransferase activity assay using
Synechocystis wild type and slr1737 knockout mutant membrane preparations.

Figure 30 is the chromatograph from an HPLC analysis of Synechocystis
extracts.

Figure 31 is a sequence alignment of the Arabidopsis homologue with the
sequence of the public database.

Figure 32 shows the results of hydropathic analysis of slr1737.

Figure 33 shows the results of hydropathic analysis of the Arabidopsis
homologue of slr1737.

Figure 34 shows the catalytic mechanism of various cyclase enzymes.

Figure 35 is a sequence alignment of slr1737, slr1737 Arabidopsis homologue
and the Arabidopsis chalcone isomerase.

DETAILED DESCRIPTION OF THE INVENTION 

The present invention provides, inter alia, compositions and methods for
altering (for example, increasing and decreasing) the tocopherol levels and/or
modulating their ratios in host cells. In particular, the present invention provides
polynucleotides, polypeptides, and methods of use thereof for the modulation of
tocopherol content in host plant cells.

The biosynthesis of α-tocopherol in higher plants involves condensation of
homogentisic acid and phytylpyrophosphate to form 2-methyl-6-phytylbenzoquinol
that can, by cyclization and subsequent methylations (Fiedler et al., 1982, Planta, 155:
511-515, Soll et al., 1980, Arch. Biochem. Biophys. 204: 544-550, Marshall et al.,
1985 Phytochem., 24: 1705-1711, all of which are herein incorporated by reference in
their entirety), form various tocopherols.

The Arabidopsis pds2 mutant, identified and characterized by Norris et al.
(1995), is deficient in tocopherol and plastoquinone-9 accumulation. Further genetic
and biochemical analysis suggested that the protein encoded by PDS2 may be
responsible for the prenylation of homogentisic acid. The PDS2 locus identified by
Norris et al. (1995) has been hypothesized to encode the tocopherol
phytyl-prenyltransferase, as the pds2 mutant fails to accumulate tocopherols.
Norris et al. (1995) determined that in Arabidopsis pds2 lies at the top of
chromosome 3, approximately 7 centimorgans above long hypocotyl2, based on the
genetic map. ATPT2 is located on chromosome 2 between 36 and 41 centimorgans,
lying on BAC F19F24, indicating that ATPT2 does not correspond to PDS2. Thus, it
is an aspect of the present invention to provide novel polynucleotides and
polypeptides involved in the prenylation of homogentisic acid. This reaction may be
a rate limiting step in tocopherol biosynthesis, and this gene has yet to be isolated.

U.S. Patent No. 5,432,069 describes the partial purification and
characterization of tocopherol cyclase from Chlorella protothecoides, Dunaliella
salina and wheat. The cyclase is described as being glycine rich, water soluble and with
a predicted MW of 48-50 kDa. However, only limited peptide fragment sequences
were available.

In one aspect, the present invention provides polynucleotide and polypeptide
sequences involved in the prenylation of straight chain and aromatic compounds.
Straight chain prenyltransferases, as used herein, comprise sequences which encode
proteins involved in the prenylation of straight chain compounds, including, but not
limited to, geranyl geranyl pyrophosphate and farnesyl pyrophosphate. Aromatic
prenyltransferases, as used herein, comprise sequences which encode proteins
involved in the prenylation of aromatic compounds, including, but not limited to,
menaquinone, ubiquinone, chlorophyll, and homogentisic acid. The prenyltransferase
of the present invention preferably prenylates homogentisic acid.

In another aspect, the invention provides polynucleotide and polypeptide
sequences to tocopherol cyclization enzymes. The 2,3-dimethyl-5-phytylplastoquinol
cyclase (tocopherol cyclase) is responsible for the cyclization of
2,3-dimethyl-5-phytylplastoquinol to tocopherol.

Isolated Polynucleotides, Proteins, and Polypeptides 

A first aspect of the present invention relates to isolated prenyltransferase
polynucleotides. Another aspect of the present invention relates to isolated tocopherol
cyclase polynucleotides. The polynucleotide sequences of the present invention
include isolated polynucleotides that encode the polypeptides of the invention having
a deduced amino acid sequence selected from the group of sequences set forth in the
Sequence Listing and to other polynucleotide sequences closely related to such
sequences and variants thereof.

The invention provides a polynucleotide sequence identical over its entire
length to each coding sequence as set forth in the Sequence Listing. The invention
also provides the coding sequence for the mature polypeptide or a fragment thereof, as
well as the coding sequence for the mature polypeptide or a fragment thereof in a
reading frame with other coding sequences, such as those encoding a leader or
secretory sequence, a pre-, pro-, or prepro- protein sequence. The polynucleotide can
also include non-coding sequences, including for example, but not limited to,
non-coding 5' and 3' sequences, such as the transcribed, untranslated sequences,
termination signals, ribosome binding sites, sequences that stabilize mRNA, introns,
polyadenylation signals, and additional coding sequence that encodes additional
amino acids. For example, a marker sequence can be included to facilitate the
purification of the fused polypeptide. Polynucleotides of the present invention also
include polynucleotides comprising a structural gene and the naturally associated
sequences that control gene expression.

The invention also includes polynucleotides of the formula:

X-(R1)n-(R2)-(R3)n-Y

wherein, at the 5' end, X is hydrogen, and at the 3' end, Y is hydrogen or a metal, R1
and R3 are any nucleic acid residue, n is an integer between 1 and 3000, preferably
between 1 and 1000 and R2 is a nucleic acid sequence of the invention, particularly a
nucleic acid sequence selected from the group set forth in the Sequence Listing and
preferably those of SEQ ID NOs: 1, 3, 5, 7, 8, 10, 11, 13-16, 18, 23, 29, 36, and 38.
In the formula, R2 is oriented so that its 5' end residue is at the left, bound to R1, and
its 3' end residue is at the right, bound to R3. Any stretch of nucleic acid residues
denoted by either R group, where R is greater than 1, may be either a heteropolymer
or a homopolymer, preferably a heteropolymer.
The invention also relates to variants of the polynucleotides described herein
that encode for variants of the polypeptides of the invention. Variants that are
fragments of the polynucleotides of the invention can be used to synthesize full-length
polynucleotides of the invention. Preferred embodiments are polynucleotides
encoding polypeptide variants wherein 5 to 10, 1 to 5, 1 to 3, 2, 1 or no amino acid
residues of a polypeptide sequence of the invention are substituted, added or deleted,
in any combination. Particularly preferred are substitutions, additions, and deletions
that are silent such that they do not alter the properties or activities of the
polynucleotide or polypeptide.

Further preferred embodiments of the invention are polynucleotides that are at
least 50%, 60%, or 70% identical over their entire length to a polynucleotide encoding
a polypeptide of the invention, and polynucleotides that are complementary to such
polynucleotides. More preferable are polynucleotides that comprise a region that is at
least 80% identical over its entire length to a polynucleotide encoding a polypeptide of
the invention and polynucleotides that are complementary thereto. In this regard,
polynucleotides at least 90% identical over their entire length are particularly
preferred, those at least 95% identical are especially preferred. Further, those with at
least 97% identity are highly preferred and those with at least 98% and 99% identity
are particularly highly preferred, with those at least 99% being the most highly
preferred.
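
The identity thresholds listed above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned (gaps written as "-") and that "identical over their entire length" is computed as matching positions divided by the full alignment length; both the function names and that choice of denominator are assumptions made for illustration, not definitions taken from this specification.

```python
# Thresholds from the paragraph above, highest first. The 50%, 60% and 70%
# levels are collapsed into a single "further preferred" tier for brevity.
TIERS = [
    (99.0, "most highly preferred"),
    (98.0, "particularly highly preferred"),
    (97.0, "highly preferred"),
    (95.0, "especially preferred"),
    (90.0, "particularly preferred"),
    (80.0, "more preferred"),
    (50.0, "further preferred"),
]


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the whole alignment; gap columns never match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("empty alignment")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def preference_tier(identity: float) -> str:
    """Map a percent identity onto the tiers named in the text."""
    for threshold, label in TIERS:
        if identity >= threshold:
            return label
    return "below the stated thresholds"


if __name__ == "__main__":
    pid = percent_identity("ACGT-ACGTA", "ACGTTACGTT")
    print(pid, preference_tier(pid))
```

Other denominators (for example, the length of the shorter sequence) would give different percentages for the same alignment, so any real comparison should state which convention is used.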

Preferred embodiments are polynucleotides that encode polypeptides that 
retain substantially the same biological function or activity as the mature polypeptides 
encoded by the polynucleotides set forth in the Sequence Listing. 

The invention further relates to polynucleotides that hybridize to the
above-described sequences. In particular, the invention relates to polynucleotides that
hybridize under stringent conditions to the above-described polynucleotides. As used
herein, the terms "stringent conditions" and "stringent hybridization conditions" mean
that hybridization will generally occur if there is at least 95% and preferably at least
97% identity between the sequences. An example of stringent hybridization
conditions is overnight incubation at 42°C in a solution comprising 50% formamide,
5x SSC (150 mM NaCl, 15 mM trisodium citrate), 50 mM sodium phosphate (pH
7.6), 5x Denhardt's solution, 10% dextran sulfate, and 20 micrograms/milliliter
denatured, sheared salmon sperm DNA, followed by washing the hybridization
support in 0.1x SSC at approximately 65°C. Other hybridization and wash conditions
are well known and are exemplified in Sambrook, et al., Molecular Cloning: A
Laboratory Manual, Second Edition, Cold Spring Harbor, NY (1989), particularly
Chapter 11.

The invention also provides a polynucleotide consisting essentially of a
polynucleotide sequence obtainable by screening an appropriate library containing the
complete gene for a polynucleotide sequence set forth in the Sequence Listing under
stringent hybridization conditions with a probe having the sequence of said
polynucleotide sequence or a fragment thereof; and isolating said polynucleotide
sequence. Fragments useful for obtaining such a polynucleotide include, for example,
probes and primers as described herein.

As discussed herein regarding polynucleotide assays of the invention, for 
example, polynucleotides of the invention can be used as a hybridization probe for 
RNA, cDNA, or genomic DNA to isolate full length cDNAs or genomic clones 
encoding a polypeptide and to isolate cDNA or genomic clones of other genes that 
have a high sequence similarity to a polynucleotide set forth in the Sequence Listing. 
Such probes will generally comprise at least 15 bases. Preferably such probes will 
have at least 30 bases and can have at least 50 bases. Particularly preferred probes 
will have between 30 bases and 50 bases, inclusive. 

The coding region of each gene that comprises or is comprised by a 
polynucleotide sequence set forth in the Sequence Listing may be isolated by 
screening using a DNA sequence provided in the Sequence Listing to synthesize an 
oligonucleotide probe. A labeled oligonucleotide having a sequence complementary 
to that of a gene of the invention is then used to screen a library of cDNA, genomic 
DNA or mRNA to identify members of the library which hybridize to the probe. For
example, synthetic oligonucleotides are prepared which correspond to the
prenyltransferase or tocopherol cyclase EST sequences. The oligonucleotides are
used as primers in polymerase chain reaction (PCR) techniques to obtain 5' and 3'
terminal sequence of prenyltransferase or tocopherol cyclase genes. Alternatively,
where oligonucleotides of low degeneracy can be prepared from particular
prenyltransferase or tocopherol cyclase peptides, such probes may be used directly to
screen gene libraries for prenyltransferase or tocopherol cyclase gene sequences. In
particular, screening of cDNA libraries in phage vectors is useful in such methods due
to lower levels of background hybridization.

Typically, a prenyltransferase or tocopherol cyclase sequence obtainable from
the use of nucleic acid probes will show 60-70% sequence identity between the target
prenyltransferase or tocopherol cyclase sequence and the encoding sequence used as a
probe. However, lengthy sequences with as little as 50-60% sequence identity may
also be obtained. The nucleic acid probes may be a lengthy fragment of the nucleic
acid sequence, or may also be a shorter, oligonucleotide probe. When longer nucleic
acid fragments are employed as probes (greater than about 100 bp), one may screen at
lower stringencies in order to obtain sequences from the target sample which have
20-50% deviation (i.e., 50-80% sequence homology) from the sequences used as probe.
Oligonucleotide probes can be considerably shorter than the entire nucleic acid
sequence encoding a prenyltransferase or tocopherol cyclase enzyme, but should be
at least about 10, preferably at least about 15, and more preferably at least about 20
nucleotides. A higher degree of sequence identity is desired when shorter regions are
used as opposed to longer regions. It may thus be desirable to identify regions of
highly conserved amino acid sequence to design oligonucleotide probes for detecting
and recovering other related prenyltransferase or tocopherol cyclase genes. Shorter
probes are often particularly useful for polymerase chain reactions (PCR), especially
when highly conserved sequences can be identified. (See, Gould, et al., PNAS USA
(1989) 86:1934-1938.)



WO 01/79472 



PCT/US01/12334 




Another aspect of the present invention relates to prenyltransferase or
tocopherol cyclase polypeptides. Such polypeptides include isolated polypeptides set
forth in the Sequence Listing, as well as polypeptides and fragments thereof,
particularly those polypeptides which exhibit prenyltransferase or tocopherol cyclase
activity and also those polypeptides which have at least 50%, 60% or 70% identity,
preferably at least 80% identity, more preferably at least 90% identity, and most
preferably at least 95% identity to a polypeptide sequence selected from the group of
sequences set forth in the Sequence Listing, and also include portions of such
polypeptides, wherein such portion of the polypeptide preferably includes at least 30
amino acids and more preferably includes at least 50 amino acids.

"Identity", as is well understood in the art, is a relationship between two or
more polypeptide sequences or two or more polynucleotide sequences, as determined
by comparing the sequences. In the art, "identity" also means the degree of sequence
relatedness between polypeptide or polynucleotide sequences, as determined by the
match between strings of such sequences. "Identity" can be readily calculated by
known methods including, but not limited to, those described in Computational
Molecular Biology, Lesk, A.M., ed., Oxford University Press, New York (1988);
Biocomputing: Informatics and Genome Projects, Smith, D.W., ed., Academic Press,
New York, 1993; Computer Analysis of Sequence Data, Part I, Griffin, A.M. and
Griffin, H.G., eds., Humana Press, New Jersey (1994); Sequence Analysis in
Molecular Biology, von Heijne, G., Academic Press (1987); Sequence Analysis
Primer, Gribskov, M. and Devereux, J., eds., Stockton Press, New York (1991); and
Carillo, H., and Lipman, D., SIAM J Applied Math, 48:1073 (1988). Methods to
determine identity are designed to give the largest match between the sequences
tested. Moreover, methods to determine identity are codified in publicly available
programs. Computer programs which can be used to determine identity between two
sequences include, but are not limited to, GCG (Devereux, J., et al., Nucleic Acids
Research 12(1):387 (1984)); suite of five BLAST programs, three designed for
nucleotide sequences queries (BLASTN, BLASTX, and TBLASTX) and two
designed for protein sequence queries (BLASTP and TBLASTN) (Coulson, Trends in
Biotechnology, 12: 76-80 (1994); Birren, et al., Genome Analysis, 1: 543-559 (1997)).
The BLAST X program is publicly available from NCBI and other sources (BLAST
Manual, Altschul, S., et al., NCBI NLM NIH, Bethesda, MD 20894; Altschul, S., et
al., J. Mol. Biol., 215:403-410 (1990)). The well known Smith Waterman algorithm
can also be used to determine identity.

Parameters for polypeptide sequence comparison typically include the
following:

Algorithm: Needleman and Wunsch, J. Mol. Biol. 48:443-453 (1970)
Comparison matrix: BLOSUM62 from Henikoff and Henikoff, Proc. Natl.
Acad. Sci. USA 89:10915-10919 (1992)
Gap Penalty: 12
Gap Length Penalty: 4

A program which can be used with these parameters is publicly available as
the "gap" program from Genetics Computer Group, Madison Wisconsin. The above
parameters along with no penalty for end gap are the default parameters for peptide
comparisons.

Parameters for polynucleotide sequence comparison include the following:

Algorithm: Needleman and Wunsch, J. Mol. Biol. 48:443-453 (1970)
Comparison matrix: matches = +10; mismatches = 0
Gap Penalty: 50
Gap Length Penalty: 3

A program which can be used with these parameters is publicly available as
the "gap" program from Genetics Computer Group, Madison Wisconsin. The above
parameters are the default parameters for nucleic acid comparisons.
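
As an illustration of how the polynucleotide parameters above enter a Needleman and Wunsch global alignment, the following sketch computes the optimal alignment score with match = +10, mismatch = 0, gap penalty = 50 and gap length penalty = 3. It is not the GCG "gap" program: the function name is hypothetical, a gap of k residues is assumed to cost 50 + 3k, and end gaps are assumed to be penalized like internal gaps.

```python
NEG_INF = float("-inf")


def global_alignment_score(a, b, match=10, mismatch=0, gap_open=50, gap_extend=3):
    """Needleman-Wunsch score with affine gaps (three-matrix Gotoh recurrence).

    Assumption: a gap of length k costs gap_open + k * gap_extend, and gaps at
    the sequence ends are charged the same as internal gaps.
    """
    n, m = len(a), len(b)
    # M: a[i-1] aligned with b[j-1]; X: a[i-1] opposite a gap; Y: b[j-1] opposite a gap.
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    X = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0
    for i in range(1, n + 1):
        X[i][0] = -(gap_open + gap_extend * i)
    for j in range(1, m + 1):
        Y[0][j] = -(gap_open + gap_extend * j)

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1]) + s
            X[i][j] = max(
                M[i - 1][j] - gap_open - gap_extend,  # open a new gap
                X[i - 1][j] - gap_extend,             # extend the current gap
                Y[i - 1][j] - gap_open - gap_extend,  # switch gap direction
            )
            Y[i][j] = max(
                M[i][j - 1] - gap_open - gap_extend,
                Y[i][j - 1] - gap_extend,
                X[i][j - 1] - gap_open - gap_extend,
            )
    return max(M[n][m], X[n][m], Y[n][m])


if __name__ == "__main__":
    print(global_alignment_score("ACGTACGT", "ACGTCGT"))
```

With these values a single-residue gap costs 53, more than five matches are worth, so the optimal alignment tolerates mismatches (score 0) far more readily than it opens gaps.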

The invention also includes polypeptides of the formula:

X-(R1)n-(R2)-(R3)n-Y

wherein, at the amino terminus, X is hydrogen, and at the carboxyl terminus, Y is
hydrogen or a metal, R1 and R3 are any amino acid residue, n is an integer between 1
and 1000, and R2 is an amino acid sequence of the invention, particularly an amino
acid sequence selected from the group set forth in the Sequence Listing and preferably
those encoded by the sequences provided in SEQ ID NOs: 2, 4, 6, 9, 12, 17, 19-22,
24-28, 30, 32-35, 37, and 39. In the formula, R2 is oriented so that its amino terminal
residue is at the left, bound to R1, and its carboxy terminal residue is at the right,
bound to R3. Any stretch of amino acid residues denoted by either R group, where R
is greater than 1, may be either a heteropolymer or a homopolymer, preferably a
heteropolymer.

Polypeptides of the present invention include isolated polypeptides encoded by
a polynucleotide comprising a sequence selected from the group of a sequence
contained in the Sequence Listing set forth herein.

The polypeptides of the present invention can be mature protein or can be part 
of a fusion protein. 

Fragments and variants of the polypeptides are also considered to be a part of
the invention. A fragment is a variant polypeptide which has an amino acid sequence
that is entirely the same as part but not all of the amino acid sequence of the
previously described polypeptides. The fragments can be "free-standing" or
comprised within a larger polypeptide of which the fragment forms a part or a region,
most preferably as a single continuous region. Preferred fragments are biologically
active fragments which are those fragments that mediate activities of the polypeptides
of the invention, including those with similar activity or improved activity or with a
decreased activity. Also included are those fragments that are antigenic or immunogenic
in an animal, particularly a human.

Variants of the polypeptide also include polypeptides that vary from the
sequences set forth in the Sequence Listing by conservative amino acid substitutions,
that is, substitution of a residue by another with like characteristics. In general, such
substitutions are among Ala, Val, Leu and Ile; between Ser and Thr; between Asp and
Glu; between Asn and Gln; between Lys and Arg; or between Phe and Tyr.
Particularly preferred are variants in which 5 to 10; 1 to 5; 1 to 3 or one amino acid(s)
are substituted, deleted, or added, in any combination.
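
The conservative substitution groups named above can be written down directly, as in the following sketch. It uses one-letter amino acid codes and compares a reference and a variant of equal length position by position; the function names and the example peptides are hypothetical and serve only to illustrate the grouping.

```python
# Groups from the paragraph above, in one-letter code:
# Ala/Val/Leu/Ile, Ser/Thr, Asp/Glu, Asn/Gln, Lys/Arg, Phe/Tyr.
CONSERVATIVE_GROUPS = [
    set("AVLI"),
    set("ST"),
    set("DE"),
    set("NQ"),
    set("KR"),
    set("FY"),
]


def is_conservative(ref: str, alt: str) -> bool:
    """True if ref -> alt is a substitution within one of the listed groups."""
    return ref != alt and any(
        ref in group and alt in group for group in CONSERVATIVE_GROUPS
    )


def classify_substitutions(reference: str, variant: str):
    """List (position, ref, alt, conservative?) for each differing residue.

    Positions are 1-based. Only substitutions are handled; insertions and
    deletions would require an alignment step that this sketch omits.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length")
    return [
        (pos, r, v, is_conservative(r, v))
        for pos, (r, v) in enumerate(zip(reference, variant), start=1)
        if r != v
    ]


if __name__ == "__main__":
    for change in classify_substitutions("MAVKDS", "MLVRDG"):
        print(change)
```

In this made-up example, Ala to Leu and Lys to Arg fall inside listed groups, while Ser to Gly does not.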

Variants that are fragments of the polypeptides of the invention can be used to 
produce the corresponding full length polypeptide by peptide synthesis. Therefore, 
these variants can be used as intermediates for producing the full-length polypeptides 
of the invention. 

The polynucleotides and polypeptides of the invention can be used, for 
example, in the transformation of host cells, such as plant host cells, as further 
discussed herein. 

The invention also provides polynucleotides that encode a polypeptide that is a 
mature protein plus additional amino or carboxyl-terminal amino acids, or amino 
acids within the mature polypeptide (for example, when the mature form of the 
protein has more than one polypeptide chain). Such sequences can, for example, play 
a role in the processing of a protein from a precursor to a mature form, allow protein 
transport, shorten or lengthen protein half-life, or facilitate manipulation of the protein 
in assays or production. It is contemplated that cellular enzymes can be used to 
remove any additional amino acids from the mature protein. 

A precursor protein, having the mature form of the polypeptide fused to one or
more prosequences, may be an inactive form of the polypeptide. The inactive
precursors generally are activated when the prosequences are removed. Some or all of
the prosequences may be removed prior to activation. Such precursor proteins are
generally called proproteins.
Plant Constructs and Methods of Use 

Of particular interest is the use of the nucleotide sequences in recombinant 
DNA constructs to direct the transcription or transcription and translation (expression) 
of the prenyltransferase or tocopherol cyclase sequences of the present invention in a 
host plant cell. The expression constructs generally comprise a promoter functional in 
a host plant cell operably linked to a nucleic acid sequence encoding a 
prenyltransferase or tocopherol cyclase of the present invention and a transcriptional 
termination region functional in a host plant cell. 

A first nucleic acid sequence is "operably linked" or "operably associated" 
with a second nucleic acid sequence when the sequences are so arranged that the first 
nucleic acid sequence affects the function of the second nucleic acid sequence. 
Preferably, the two sequences are part of a single contiguous nucleic acid molecule 
and more preferably are adjacent. For example, a promoter is operably linked to a 
gene if the promoter regulates or mediates transcription of the gene in a cell. 

Those skilled in the art will recognize that there are a number of promoters 
which are functional in plant cells, and have been described in the literature. 
Chloroplast and plastid specific promoters, chloroplast or plastid functional 
promoters, and chloroplast or plastid operable promoters are also envisioned. 

One set of plant functional promoters are constitutive promoters such as the 
CaMV35S or FMV35S promoters that yield high levels of expression in most plant 
organs. Enhanced or duplicated versions of the CaMV35S and FMV35S promoters 
are useful in the practice of this invention (Odell, et al. (1985) Nature 313:810-812; 
Rogers, U.S. Patent Number 5,378,619). In addition, it may also be preferred to bring 
about expression of the prenyltransferase or tocopherol cyclase gene in specific tissues 
of the plant, such as leaf, stem, root, tuber, seed, fruit, etc., and the promoter chosen 
should have the desired tissue and developmental specificity. 

Of particular interest is the expression of the nucleic acid sequences of the 
present invention from transcription initiation regions which are preferentially 
expressed in a plant seed tissue. Examples of such seed preferential transcription 
initiation sequences include those sequences derived from sequences encoding plant 
storage protein genes or from genes involved in fatty acid biosynthesis in oilseeds. 
Examples of such promoters include the 5' regulatory regions from such genes as 
napin (Kridl et al., Seed Sci. Res. 1:209-219 (1991)), phaseolin, zein, soybean trypsin 
inhibitor, ACP, stearoyl-ACP desaturase, soybean α' subunit of β-conglycinin (soy 
7s, (Chen et al., Proc. Natl. Acad. Sci., 83:8560-8564 (1986))) and oleosin. 



WO 01/79472 



PCT/US01/12334 






It may be advantageous to direct the localization of proteins conferring 
prenyltransferase or tocopherol cyclase activity to a particular subcellular compartment, for 
example, to the mitochondrion, endoplasmic reticulum, vacuoles, chloroplast or other 
plastidic compartment. For example, where the genes of interest of the present 
invention will be targeted to plastids, such as chloroplasts, for expression, the 
constructs will also employ the use of sequences to direct the gene to the plastid. 
Such sequences are referred to herein as chloroplast transit peptides (CTP) or plastid 
transit peptides (PTP). In this manner, where the gene of interest is not directly 
inserted into the plastid, the expression construct will additionally contain a gene 
encoding a transit peptide to direct the gene of interest to the plastid. The chloroplast 
transit peptides may be derived from the gene of interest, or may be derived from a 
heterologous sequence having a CTP. Such transit peptides are known in the art. See, 
for example, Von Heijne et al. (1991) Plant Mol. Biol. Rep. 9:104-126; Clark et al. 
(1989) J. Biol. Chem. 264:17544-17550; della-Cioppa et al. (1987) Plant Physiol. 
84:965-968; Romer et al. (1993) Biochem. Biophys. Res. Commun. 196:1414-1421; 
and Shah et al. (1986) Science 233:478-481. 

Depending upon the intended use, the constructs may contain the nucleic acid 
sequence which encodes the entire prenyltransferase or tocopherol cyclase protein, or 
a portion thereof. For example, where antisense inhibition of a given 
prenyltransferase or tocopherol cyclase protein is desired, the entire prenyltransferase 
or tocopherol cyclase sequence is not required. Furthermore, where prenyltransferase 
or tocopherol cyclase sequences used in constructs are intended for use as probes, it 
may be advantageous to prepare constructs containing only a particular portion of a 
prenyltransferase or tocopherol cyclase encoding sequence, for example a sequence 
which is discovered to encode a highly conserved prenyltransferase or tocopherol 
cyclase region. 

The skilled artisan will recognize that there are various methods for the 
inhibition of expression of endogenous sequences in a host cell. Such methods 
include, but are not limited to, antisense suppression (Smith, et al. (1988) Nature 
334:724-726), co-suppression (Napoli, et al. (1989) Plant Cell 2:279-289), 
ribozymes (PCT Publication WO 97/10328), and combinations of sense and antisense 
(Waterhouse, et al. (1998) Proc. Natl. Acad. Sci. USA 95:13959-13964). Methods for 
the suppression of endogenous sequences in a host cell typically employ the 
transcription or transcription and translation of at least a portion of the sequence to be 
suppressed. Such sequences may be homologous to coding as well as non-coding 
regions of the endogenous sequence. 

Regulatory transcript termination regions may be provided in plant expression 
constructs of this invention as well. Transcript termination regions may be provided 
by the DNA sequence encoding the prenyltransferase or tocopherol cyclase or a 
convenient transcription termination region derived from a different gene source, for 
example, the transcript termination region which is naturally associated with the 
transcript initiation region. The skilled artisan will recognize that any convenient 
transcript termination region which is capable of terminating transcription in a plant 
cell may be employed in the constructs of the present invention. 

Alternatively, constructs may be prepared to direct the expression of the 
prenyltransferase or tocopherol cyclase sequences directly from the host plant cell 
plastid. Such constructs and methods are known in the art and are generally 
described, for example, in Svab, et al. (1990) Proc. Natl. Acad. Sci. USA 87:8526-8530 
and Svab and Maliga (1993) Proc. Natl. Acad. Sci. USA 90:913-917 and in U.S. 
Patent Number 5,693,507. 

The prenyltransferase or tocopherol cyclase constructs of the present invention 
can be used in transformation methods with additional constructs providing for the 
expression of other nucleic acid sequences encoding proteins involved in the 
production of tocopherols, or tocopherol precursors such as homogentisic acid and/or 
phytylpyrophosphate. Nucleic acid sequences encoding proteins involved in the 
production of homogentisic acid are known in the art, and include, but are not limited 
to, 4-hydroxyphenylpyruvate dioxygenase (HPPD, EC 1.13.11.27) described, for 
example, by Garcia, et al. ((1999) Plant Physiol. 119(4):1507-1516), mono or 
bifunctional tyrA (described, for example, by Xia, et al. (1992) J. Gen. Microbiol. 
138:1309-1316, and Hudson, et al. (1984) J. Mol. Biol. 180:1023-1051), (Oxygenase, 
4-hydroxyphenylpyruvate di- (9CI), 4-Hydroxyphenylpyruvate dioxygenase; 
p-Hydroxyphenylpyruvate dioxygenase; p-Hydroxyphenylpyruvate hydroxylase; 
p-Hydroxyphenylpyruvate oxidase; p-Hydroxyphenylpyruvic acid hydroxylase; 
p-Hydroxyphenylpyruvic hydroxylase; p-Hydroxyphenylpyruvic oxidase), 
4-hydroxyphenylacetate, NAD(P)H:oxygen oxidoreductase (1-hydroxylating); 
4-hydroxyphenylacetate 1-monooxygenase, and the like. In addition, constructs for 
the expression of nucleic acid sequences encoding proteins involved in the production 
of phytylpyrophosphate can also be employed with the prenyltransferase or tocopherol 
cyclase constructs of the present invention. Nucleic acid sequences encoding proteins 
involved in the production of phytylpyrophosphate are known in the art, and include, 
but are not limited to, geranylgeranylpyrophosphate synthase (GGPPS), 
geranylgeranylpyrophosphate reductase (GGH), 1-deoxyxylulose-5-phosphate 
synthase, 1-deoxy-D-xylulose-5-phosphate reductoisomerase, 4-diphosphocytidyl-2-C-methylerythritol 
synthase, and isopentenyl pyrophosphate isomerase. 

The prenyltransferase or tocopherol cyclase sequences of the present invention 
find use in the preparation of transformation constructs having a second expression 
cassette for the expression of additional sequences involved in tocopherol 
biosynthesis. Additional tocopherol biosynthesis sequences of interest in the present 
invention include, but are not limited to, gamma-tocopherol methyltransferase 
(Shintani, et al (1998) Science 282(5396):2098-2100), tocopherol cyclase, and 
tocopherol methyltransferase. 

A plant cell, tissue, organ, or plant into which the recombinant DNA 
constructs containing the expression constructs have been introduced is considered 
transformed, transfected, or transgenic. A transgenic or transformed cell or plant also 
includes progeny of the cell or plant and progeny produced from a breeding program 
employing such a transgenic plant as a parent in a cross and exhibiting an altered 
phenotype resulting from the presence of a prenyltransferase or tocopherol cyclase 
nucleic acid sequence. 

Plant expression or transcription constructs having a prenyltransferase or 
tocopherol cyclase as the DNA sequence of interest for increased or decreased 
expression thereof may be employed with a wide variety of plant life, particularly, 
plant life involved in the production of vegetable oils for edible and industrial uses. 
Particularly preferred plants for use in the methods of the present invention include, 
but are not limited to: Acacia, alfalfa, aneth, apple, apricot, artichoke, arugula, 
asparagus, avocado, banana, barley, beans, beet, blackberry, blueberry, broccoli, 
brussels sprouts, cabbage, canola, cantaloupe, carrot, cassava, cauliflower, celery, 
cherry, chicory, cilantro, citrus, Clementines, coffee, corn, cotton, cucumber, Douglas 
fir, eggplant, endive, escarole, eucalyptus, fennel, figs, garlic, gourd, grape, grapefruit, 
honey dew, jicama, kiwifruit, lettuce, leeks, lemon, lime, Loblolly pine, mango, 
melon, mushroom, nectarine, nut, oat, oil palm, oil seed rape, okra, onion, orange, an 
ornamental plant, papaya, parsley, pea, peach, peanut, pear, pepper, persimmon, pine, 
pineapple, plantain, plum, pomegranate, poplar, potato, pumpkin, quince, radiata pine, 
radicchio, radish, raspberry, rice, rye, sorghum, Southern pine, soybean, spinach, 
squash, strawberry, sugarbeet, sugarcane, sunflower, sweet potato, sweetgum, 
tangerine, tea, tobacco, tomato, triticale, turf, turnip, a vine, watermelon, wheat, yams, 
and zucchini. 

Most especially preferred are temperate oilseed crops. Temperate oilseed 
crops of interest include, but are not limited to, rapeseed (Canola and High Erucic 
Acid varieties), sunflower, safflower, cotton, soybean, peanut, coconut and oil palms, 
and corn. Depending on the method for introducing the recombinant constructs into 
the host cell, other DNA sequences may be required. Importantly, this invention is 
applicable to dicotyledonous and monocotyledonous species alike and will be readily 
applicable to new and/or improved transformation and regulation techniques. 

Of particular interest is the use of prenyltransferase or tocopherol cyclase 
constructs in plants to produce plants or plant parts, including, but not limited to, 
leaves, stems, roots, reproductive, and seed, with a modified content of tocopherols in 
plant parts having transformed plant cells. 

For immunological screening, antibodies to the protein can be prepared by 
injecting rabbits or mice with the purified protein or portion thereof, such methods of 
preparing antibodies being well known to those in the art. Either monoclonal or 
polyclonal antibodies can be produced, although typically polyclonal antibodies are 
more useful for gene isolation. Western analysis may be conducted to determine that 
a related protein is present in a crude extract of the desired plant species, as 
determined by cross-reaction with the antibodies to the encoded proteins. When 
cross-reactivity is observed, genes encoding the related proteins are isolated by 
screening expression libraries representing the desired plant species. Expression 
libraries can be constructed in a variety of commercially available vectors, including 
lambda gt11, as described in Sambrook, et al. (Molecular Cloning: A Laboratory 
Manual, Second Edition (1989) Cold Spring Harbor Laboratory, Cold Spring Harbor, 
New York). 

To confirm the activity and specificity of the proteins encoded by the 
identified nucleic acid sequences as prenyltransferase or tocopherol cyclase enzymes, 
in vitro assays are performed in insect cell cultures using baculovirus expression 
systems. Such baculovirus expression systems are known in the art and are described 
by Lee, et al., U.S. Patent Number 5,348,886, the entirety of which is herein 
incorporated by reference. 

In addition, other expression constructs may be prepared to assay for protein 
activity utilizing different expression systems. Such expression constructs are 
transformed into a yeast or prokaryotic host and assayed for prenyltransferase or 
tocopherol cyclase activity. Such expression systems are known in the art and are 
readily available through commercial sources. 

In addition to the sequences described in the present invention, DNA coding 
sequences useful in the present invention can be derived from algae, fungi, bacteria, 
mammalian sources, plants, etc. Homology searches in existing databases using 
signature sequences corresponding to conserved nucleotide and amino acid sequences 
of prenyltransferase or tocopherol cyclase can be employed to isolate equivalent, 
related genes from other sources such as plants and microorganisms. Searches in 
EST databases can also be employed. Furthermore, the use of DNA sequences 
encoding enzymes functionally and enzymatically equivalent to those disclosed herein, 
wherein such DNA sequences are degenerate equivalents of the nucleic acid 
sequences disclosed herein in accordance with the degeneracy of the genetic code, is 
also encompassed by the present invention. Demonstration of the functionality of 
coding sequences identified by any of these methods can be carried out by 
complementation of mutants of appropriate organisms, such as Synechocystis, 
Shewanella, yeast, Pseudomonas, Rhodobacteria, etc., that lack specific biochemical 
reactions, or that have been mutated. The sequences of the DNA coding regions can 
be optimized by gene resynthesis, based on codon usage, for maximum expression in 
particular hosts. 
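
As an illustration of degenerate equivalents and codon-based resynthesis, the following minimal sketch recodes a coding sequence with synonymous codons while leaving the encoded protein unchanged. It assumes the standard genetic code; the `preferred` table in the usage note is hypothetical and does not represent measured codon usage for any host.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a coding sequence in frame 1; '*' marks a stop codon."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def recode(dna: str, preferred: dict[str, str]) -> str:
    """Swap each codon for the preferred synonymous codon of its amino acid.

    `preferred` maps one-letter amino acid codes to a codon; amino acids not
    listed keep their original codon.
    """
    if len(dna) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        aa = CODON_TABLE[codon]
        choice = preferred.get(aa, codon)
        if CODON_TABLE[choice] != aa:
            raise ValueError(f"{choice} does not encode {aa}")
        out.append(choice)
    return "".join(out)
```

With the hypothetical preference table `{"L": "CTC", "S": "TCC", "K": "AAG"}`, `recode("ATGTTATCAAAATAA", ...)` gives `ATGCTCTCCAAGTAA`, and both sequences translate to the same peptide.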

For the alteration of tocopherol production in a host cell, a second expression 
construct can be used in accordance with the present invention. For example, the 
prenyltransferase or tocopherol cyclase expression construct can be introduced into a 
host cell in conjunction with a second expression construct having a nucleotide 
sequence for a protein involved in tocopherol biosynthesis. 

The method of transformation in obtaining such transgenic plants is not critical 
to the instant invention, and various methods of plant transformation are currently 
available. Furthermore, as newer methods become available to transform crops, they 
may also be directly applied hereunder. For example, many plant species naturally 
susceptible to Agrobacterium infection may be successfully transformed via tripartite 
or binary vector methods of Agrobacterium mediated transformation. In many 
instances, it will be desirable to have the construct bordered on one or both sides by 
T-DNA, particularly having the left and right borders, more particularly the right 
border. This is particularly useful when the construct uses A. tumefaciens or A. 
rhizogenes as a mode for transformation, although the T-DNA borders may find use 
with other modes of transformation. In addition, techniques of microinjection, DNA 
particle bombardment, and electroporation have been developed which allow for the 
transformation of various monocot and dicot plant species. 

Normally, included with the DNA construct will be a structural gene having 
the necessary regulatory regions for expression in a host and providing for selection of 
transformant cells. The gene may provide for resistance to a cytotoxic agent, e.g. 
antibiotic, heavy metal, toxin, etc., complementation providing prototrophy to an 
auxotrophic host, viral immunity or the like. Depending upon the number of different 
host species into which the expression construct or components thereof are introduced, one or 
more markers may be employed, where different conditions for selection are used for 
the different hosts. 

Where Agrobacterium is used for plant cell transformation, a vector may be 
used which may be introduced into the Agrobacterium host for homologous 
recombination with T-DNA or the Ti- or Ri-plasmid present in the Agrobacterium 
host. The Ti- or Ri-plasmid containing the T-DNA for recombination may be armed 
(capable of causing gall formation) or disarmed (incapable of causing gall formation), 
the latter being permissible, so long as the vir genes are present in the transformed 
Agrobacterium host. The armed plasmid can give a mixture of normal plant cells and 
gall. 

In some instances where Agrobacterium is used as the vehicle for transforming 
host plant cells, the expression or transcription construct bordered by the T-DNA 
border region(s) will be inserted into a broad host range vector capable of replication 
in E. coli and Agrobacterium, there being broad host range vectors described in the 
literature. Commonly used is pRK2 or derivatives thereof. See, for example, Ditta, et 
al. (Proc. Nat. Acad. Sci. U.S.A. (1980) 77:7347-7351) and EPA 0 120 515, which 
are incorporated herein by reference. Alternatively, one may insert the sequences to 
be expressed in plant cells into a vector containing separate replication sequences, one 
of which stabilizes the vector in E. coli, and the other in Agrobacterium. See, for 
example, McBride, et al. (Plant Mol. Biol. (1990) 14:269-276), wherein the pRiHRI 
(Jouanin, et al., Mol. Gen. Genet. (1985) 201:370-374) origin of replication is utilized 
and provides for added stability of the plant expression vectors in host Agrobacterium 
cells. 

Included with the expression construct and the T-DNA will be one or more 
markers, which allow for selection of transformed Agrobacterium and transformed 
plant cells. A number of markers have been developed for use with plant cells, such as 
resistance to chloramphenicol, kanamycin, the aminoglycoside G418, hygromycin, or 
the like. The particular marker employed is not essential to this invention, one or 
another marker being preferred depending on the particular host and the manner of 
construction. 

For transformation of plant cells using Agrobacterium, explants may be 
combined and incubated with the transformed Agrobacterium for sufficient time for 
transformation, the bacteria killed, and the plant cells cultured in an appropriate 
selective medium. Once callus forms, shoot formation can be encouraged by 
employing the appropriate plant hormones in accordance with known methods and the 
shoots transferred to rooting medium for regeneration of plants. The plants may then 
be grown to seed and the seed used to establish repetitive generations and for isolation 
of vegetable oils. 

There are several possible ways to obtain the plant cells of this invention 
which contain multiple expression constructs. Any means for producing a plant 
comprising a construct having a DNA sequence encoding the expression construct of 
the present invention, and at least one other construct having another DNA sequence 
encoding an enzyme are encompassed by the present invention. For example, the 
expression construct can be used to transform a plant at the same time as the second 
construct either by inclusion of both expression constructs in a single transformation 
vector or by using separate vectors, each of which expresses desired genes. The second 
construct can be introduced into a plant which has already been transformed with the 
prenyltransferase or tocopherol cyclase expression construct, or alternatively, 
transformed plants, one expressing the prenyltransferase or tocopherol cyclase 
construct and one expressing the second construct, can be crossed to bring the 
constructs together in the same plant. 

Transgenic plants of the present invention may be produced from tissue 
culture, and subsequent generations grown from seed. Alternatively, transgenic plants 
may be grown using apomixis. Apomixis is a genetically controlled method of 
reproduction in plants where the embryo is formed without union of an egg and a 
sperm. There are three basic types of apomictic reproduction: 1) apospory where the 
embryo develops from a chromosomally unreduced egg in an embryo sac derived 
from the nucellus, 2) diplospory where the embryo develops from an unreduced egg in 
an embryo sac derived from the megaspore mother cell, and 3) adventitious embryony 
where the embryo develops directly from a somatic cell. In most forms of apomixis, 
pseudogamy or fertilization of the polar nuclei to produce endosperm is necessary for 
seed viability. In apospory, a nurse cultivar can be used as a pollen source for 
endosperm formation in seeds. The nurse cultivar does not affect the genetics of the 
aposporous apomictic cultivar since the unreduced egg of the cultivar develops 
parthenogenetically, but makes possible endosperm production. Apomixis is 
economically important, especially in transgenic plants, because it causes any 
genotype, no matter how heterozygous, to breed true. Thus, with apomictic 
reproduction, heterozygous transgenic plants can maintain their genetic fidelity 
throughout repeated life cycles. Methods for the production of apomictic plants are 
known in the art. See U.S. Patent No. 5,811,636, which is herein incorporated by 
reference in its entirety. 

The nucleic acid sequences of the present invention can be used in constructs 
to provide for the expression of the sequence in a variety of host cells, both 
prokaryotic and eukaryotic. Host cells of the present invention preferably include 
monocotyledonous and dicotyledonous plant cells. 

In general, the skilled artisan is familiar with the standard resource materials 
which describe specific conditions and procedures for the construction, manipulation 
and isolation of macromolecules (e.g., DNA molecules, plasmids, etc.), generation of 
recombinant organisms and the screening and isolating of clones (see, for example, 
Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor 
Press (1989); Maliga et al., Methods in Plant Molecular Biology, Cold Spring Harbor 
Press (1995), the entirety of which is herein incorporated by reference; Birren et al., 
Genome Analysis: Analyzing DNA, 1, Cold Spring Harbor, New York, the entirety of 
which is herein incorporated by reference). 

Methods for the expression of sequences in insect host cells are known in the 
art. Baculovirus expression vectors are recombinant insect viruses in which the coding 
sequence for a chosen foreign gene has been inserted behind a baculovirus promoter 
in place of the viral gene, e.g., polyhedrin (Smith and Summers, U.S. Pat. No. 
4,745,051, the entirety of which is incorporated herein by reference). Baculovirus 
expression vectors are known in the art, and are described, for example, in Doerfler, 
Curr. Top. Microbiol. Immunol. 131:51-68 (1968); Luckow and Summers, 
Bio/Technology 5:47-55 (1988a); Miller, Annual Review of Microbiol. 42:177-199 
(1988); Summers, Curr. Comm. Molecular Biology, Cold Spring Harbor Press, Cold 
Spring Harbor, N.Y. (1988); Summers and Smith, A Manual of Methods for 
Baculovirus Vectors and Insect Cell Culture Procedures, Texas Ag. Exper. Station 
Bulletin No. 1555 (1988), the entireties of which are herein incorporated by reference). 

Methods for the expression of a nucleic acid sequence of interest in a fungal 
host cell are known in the art. The fungal host cell may, for example, be a yeast cell or 
a filamentous fungal cell. Methods for the expression of DNA sequences of interest in 
yeast cells are generally described in "Guide to yeast genetics and molecular biology", 
Guthrie and Fink, eds., Methods in Enzymology, Academic Press, Inc., Vol 194 (1991) 
and "Gene expression technology", Goeddel ed., Methods in Enzymology, Academic 
Press, Inc., Vol 185 (1991). 

Mammalian cell lines available as hosts for expression are known in the art 
and include many immortalized cell lines available from the American Type Culture 
Collection (ATCC, Manassas, VA), such as HeLa cells, Chinese hamster ovary 
(CHO) cells, baby hamster kidney (BHK) cells and a number of other cell lines. 
Suitable promoters for mammalian cells are also known in the art and include, but are 
not limited to, viral promoters such as that from Simian Virus 40 (SV40) (Fiers et al., 
Nature 273:113 (1978), the entirety of which is herein incorporated by reference), 
Rous sarcoma virus (RSV), adenovirus (ADV) and bovine papilloma virus (BPV). 
Mammalian cells may also require terminator sequences and poly-A addition 
sequences. Enhancer sequences which increase expression may also be included and 
sequences which promote amplification of the gene may also be desirable (for 
example methotrexate resistance genes). 

Vectors suitable for replication in mammalian cells are well known in the art, 
and may include viral replicons, or sequences which ensure integration of the 
appropriate sequences encoding epitopes into the host genome. Plasmid vectors that 
greatly facilitate the construction of recombinant viruses have been described (see, for 
example, Mackett et al., J. Virol. 49:857 (1984); Chakrabarti et al., Mol. Cell. Biol. 
5:3403 (1985); Moss, In: Gene Transfer Vectors For Mammalian Cells (Miller and 
Calos, eds., Cold Spring Harbor Laboratory, N.Y., p. 10 (1987)); all of which are 
herein incorporated by reference in their entirety). 

The invention also includes plants and plant parts, such as seed, oil and meal 
derived from seed, and feed and food products processed from plants, which are 
enriched in tocopherols. Of particular interest is seed oil obtained from transgenic 
plants where the tocopherol level has been increased as compared to seed oil of a 
non-transgenic plant. 

The harvested plant material may be subjected to additional processing to 
further enrich the tocopherol content. The skilled artisan will recognize that there are 
many such processes or methods for refining, bleaching and degumming oil. United 
States Patent Number 5,932,261, issued August 3, 1999, discloses one such process, 
for the production of a natural carotene rich refined and deodorised oil by subjecting 
the oil to a pressure of less than 0.060 mbar and to a temperature of less than 
200°C. Oil distilled by this process has reduced free fatty acids, yielding a 
refined, deodorised oil where Vitamin E contained in the feed oil is substantially 
retained in the processed oil. The teachings of this patent are incorporated herein by 
reference. 



The invention now being generally described, it will be more readily 
understood by reference to the following examples which are included for purposes of 
illustration only and are not intended to limit the present invention. 

EXAMPLES 

Example 1: Identification of Prenyltransferase or Tocopherol Cyclase Sequences 

PSI-BLAST (Altschul, et al. (1997) Nuc Acid Res 25:3389-3402) profiles 
were generated for both the straight chain and aromatic classes of prenyltransferases. 
To generate the straight chain profile, a prenyltransferase from Porphyra purpurea 
(Genbank accession 1709766) was used as a query against the NCBI non-redundant 
protein database. The E. coli enzyme involved in the formation of ubiquinone, ubiA 
(Genbank accession 1790473), was used as a starting sequence to generate the aromatic 
prenyltransferase profile. These profiles were used to search public and proprietary 
DNA and protein databases. In Arabidopsis, six putative prenyltransferases of the 
straight-chain class were identified, ATPT1 (SEQ ID NO:9), ATPT7 (SEQ ID 
NO:10), ATPT8 (SEQ ID NO:11), ATPT9 (SEQ ID NO:13), ATPT10 (SEQ ID 
NO:14), and ATPT11 (SEQ ID NO:15), and six were identified of the aromatic class, 
ATPT2 (SEQ ID NO:1), ATPT3 (SEQ ID NO:3), ATPT4 (SEQ ID NO:5), ATPT5 
(SEQ ID NO:7), ATPT6 (SEQ ID NO:8), and ATPT12 (SEQ ID NO:16). Additional 
prenyltransferase sequences from other plants related to the aromatic class of 
prenyltransferases, such as soy (SEQ ID NOs:19-23, the deduced amino acid 
sequence of SEQ ID NO:23 is provided in SEQ ID NO:24) and maize (SEQ ID 
NOs:25-29, and 31) are also identified. The deduced amino acid sequence of ZMPT5 
(SEQ ID NO:29) is provided in SEQ ID NO:30. 

Searches are performed on a Silicon Graphics Unix computer using additional 
Bioaccellerator hardware and GenWeb software supplied by Compugen Ltd. This 
software and hardware enables the use of the Smith-Waterman algorithm in searching 
DNA and protein databases using profiles as queries. The program used to query 
protein databases is profilesearch. This is a search where the query is not a single 
sequence but a profile based on a multiple alignment of amino acid or nucleic acid 
sequences. The profile is used to query a sequence data set, i.e., a sequence database. 
The profile contains all the pertinent information for scoring each position in a 
sequence, in effect replacing the "scoring matrix" used for the standard query 
searches. The program used to query nucleotide databases with a protein profile is 
tprofilesearch. Tprofilesearch searches nucleic acid databases using an amino acid 
profile query. As the search is running, sequences in the database are translated to 
amino acid sequences in six reading frames. The output file for tprofilesearch is 
identical to the output file for profilesearch except for an additional column that 
indicates the frame in which the best alignment occurred. 
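
The six-frame translation step described above can be sketched as follows. This is an illustration only, assuming the standard genetic code and an unambiguous A/C/G/T alphabet; it is not the tprofilesearch implementation, and the frame labels ("+1" to "-3") are a naming choice made here.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons from position 0; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna: str) -> dict[str, str]:
    """Translate a nucleotide sequence in three forward and three reverse frames."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames
```

Each of the six translations would then be scored against the amino acid profile, and the label of the best-scoring frame corresponds to the additional output column mentioned above.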

The Smith-Waterman algorithm (Smith and Waterman (1981) supra) is used 
to search for similarities between one sequence from the query and a group of 
sequences contained in the database. E score values as well as other sequence 
information, such as conserved peptide sequences, are used to identify related 
sequences. 
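
For orientation, a minimal Smith-Waterman local alignment of two single sequences can be written as below. The scoring values (match +2, mismatch -1, linear gap penalty -2) are assumptions chosen for illustration; the searches described here use profiles and hardware acceleration rather than this simple scheme.

```python
def smith_waterman(a: str, b: str, match: int = 2, mismatch: int = -1,
                   gap: int = -2) -> tuple[int, str, str]:
    """Return (best local score, aligned part of a, aligned part of b)."""
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] is the best score of a local alignment ending at a[i-1], b[j-1].
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until a zero-score cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        score = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + score:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))
```

Because scores are floored at zero, unrelated flanking sequence does not penalize a shared internal region, which is why the method suits searches for conserved regions within otherwise divergent sequences.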

To obtain the entire coding region corresponding to the Arabidopsis 
prenyltransferase sequences, synthetic oligonucleotide primers are designed to 
amplify the 5' and 3' ends of partial cDNA clones containing prenyltransferase 
sequences. Primers are designed according to the respective Arabidopsis 
prenyltransferase sequences and used in Rapid Amplification of cDNA Ends (RACE) 
reactions (Frohman et al. (1988) Proc. Natl. Acad. Sci. USA 85:8998-9002) using the 
Marathon cDNA amplification kit (Clontech Laboratories Inc, Palo Alto, CA). 

Amino acid sequence alignments between ATPT2 (SEQ ID NO:2), ATPT3
(SEQ ID NO:4), ATPT4 (SEQ ID NO:6), ATPT8 (SEQ ID NO:12), and ATPT12
(SEQ ID NO:17) are performed using ClustalW (Figure 1), and the percent identity
and similarities are provided in Table 1 below.

Table 1:

Example 2: Preparation of Prenyl Transferase Expression Constructs

A plasmid containing the napin cassette derived from pCGN3223 (described
in USPN 5,639,790, the entirety of which is incorporated herein by reference) was
modified to make it more useful for cloning large DNA fragments containing multiple
restriction sites, and to allow the cloning of multiple napin fusion genes into plant
binary transformation vectors. An adapter comprised of the self-annealed
oligonucleotide of sequence
CGCGATTTAAATGGCGCGCCCTGCAGGCGGCCGCCTGCAGGGCGCGCCATTTAAAT
(SEQ ID NO:40) was ligated into the cloning vector pBC SK+ (Stratagene)
after digestion with the restriction endonuclease BssHII to construct vector
pCGN7765. Plasmids pCGN3223 and pCGN7765 were digested with NotI and
ligated together. The resultant vector, pCGN7770, contains the pCGN7765 backbone
with the napin seed-specific expression cassette from pCGN3223.
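
The description of the adapter as self-annealed can be checked with a short script. The sketch below assumes that BssHII leaves a four-base 5'-CGCG overhang, so that the remainder of SEQ ID NO:40 must be its own reverse complement for two copies of the oligonucleotide to anneal; the helper names are hypothetical, and the recognition sequences listed are the commonly cited ones for each enzyme.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def is_self_complementary(dna: str) -> bool:
    """True when a strand can base-pair with a second copy of itself."""
    return dna == reverse_complement(dna)


ADAPTER = "CGCGATTTAAATGGCGCGCCCTGCAGGCGGCCGCCTGCAGGGCGCGCCATTTAAAT"
OVERHANG, CORE = ADAPTER[:4], ADAPTER[4:]

# Recognition sequences assumed for illustration.
SITES = {
    "SwaI": "ATTTAAAT",
    "AscI": "GGCGCGCC",
    "Sse8387I": "CCTGCAGG",
    "NotI": "GCGGCCGC",
}

print("overhang:", OVERHANG)
print("core self-complementary:", is_self_complementary(CORE))
for name, site in SITES.items():
    print(name, "present" if site in CORE else "absent")
```

A self-complementary core flanked by the overhang means a single oligonucleotide is sufficient to form the double-stranded adapter, which is consistent with only one sequence being listed.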

The cloning cassette pCGN7787 contains essentially the same regulatory elements as
pCGN7770, with the exception that the napin regulatory regions of pCGN7770 have
been replaced with the double CaMV 35S promoter and the tml polyadenylation and
transcriptional termination region.

A binary vector for plant transformation, pCGN5139, was constructed from
pCGN1558 (McBride and Summerfelt, (1990) Plant Molecular Biology, 14:269-276).
The polylinker of pCGN1558 was replaced as a HindIII/Asp718 fragment with a
polylinker containing unique restriction endonuclease sites, AscI, PacI, XbaI, SwaI,
BamHI, and NotI. The Asp718 and HindIII restriction endonuclease sites are retained
in pCGN5139.

A series of turbo binary vectors are constructed to allow for the rapid cloning
of DNA sequences into binary vectors containing transcriptional initiation regions
(promoters) and transcriptional termination regions.

The plasmid pCGN8618 was constructed by ligating oligonucleotides 5'-
TCGAGGATCCGCGGCCGCAAGCTTCCTGCAGG-3' (SEQ ID NO:41) and 5'-
TCGACCTGCAGGAAGCTTGCGGCCGCGGATCC-3' (SEQ ID NO:42) into
SalI/XhoI-digested pCGN7770. A fragment containing the napin promoter,
polylinker and napin 3' region was excised from pCGN8618 by digestion with
Asp718I; the fragment was blunt-ended by filling in the 5' overhangs with Klenow
fragment then ligated into pCGN5139 that had been digested with Asp718I and
HindIII and blunt-ended by filling in the 5' overhangs with Klenow fragment. A
plasmid containing the insert oriented so that the napin promoter was closest to the
blunted Asp718I site of pCGN5139 and the napin 3' was closest to the blunted
HindIII site was subjected to sequence analysis to confirm both the insert orientation
and the integrity of cloning junctions. The resulting plasmid was designated
pCGN8622.

The plasmid pCGN8619 was constructed by ligating oligonucleotides 5'-
TCGACCTGCAGGAAGCTTGCGGCCGCGGATCC-3' (SEQ ID NO:43) and 5'-
TCGAGGATCCGCGGCCGCAAGCTTCCTGCAGG-3' (SEQ ID NO:44) into
SalI/XhoI-digested pCGN7770. A fragment containing the napin promoter,
polylinker and napin 3' region was removed from pCGN8619 by digestion with
Asp718I; the fragment was blunt-ended by filling in the 5' overhangs with Klenow
fragment then ligated into pCGN5139 that had been digested with Asp718I and
HindIII and blunt-ended by filling in the 5' overhangs with Klenow fragment. A
plasmid containing the insert oriented so that the napin promoter was closest to the
blunted Asp718I site of pCGN5139 and the napin 3' was closest to the blunted
HindIII site was subjected to sequence analysis to confirm both the insert orientation
and the integrity of cloning junctions. The resulting plasmid was designated
pCGN8623.

The plasmid pCGN8620 was constructed by ligating oligonucleotides 5'-
TCGAGGATCCGCGGCCGCAAGCTTCCTGCAGGAGCT-3' (SEQ ID NO:45)
and 5'-CCTGCAGGAAGCTTGCGGCCGCGGATCC-3' (SEQ ID NO:46) into
SalI/SacI-digested pCGN7787. A fragment containing the d35S promoter, polylinker
and tml 3' region was removed from pCGN8620 by complete digestion with Asp718I
and partial digestion with NotI. The fragment was blunt-ended by filling in the 5'
overhangs with Klenow fragment then ligated into pCGN5139 that had been digested
with Asp718I and HindIII and blunt-ended by filling in the 5' overhangs with Klenow
fragment. A plasmid containing the insert oriented so that the d35S promoter was
closest to the blunted Asp718I site of pCGN5139 and the tml 3' was closest to the
blunted HindIII site was subjected to sequence analysis to confirm both the insert
orientation and the integrity of cloning junctions. The resulting plasmid was
designated pCGN8624.

The plasmid pCGN8621 was constructed by ligating oligonucleotides 5'-
TCGACCTGCAGGAAGCTTGCGGCCGCGGATCCAGCT-3' (SEQ ID NO:47)
and 5'-GGATCCGCGGCCGCAAGCTTCCTGCAGG-3' (SEQ ID NO:48) into
SalI/SacI-digested pCGN7787. A fragment containing the d35S promoter, polylinker
and tml 3' region was removed from pCGN8621 by complete digestion with Asp718I
and partial digestion with NotI. The fragment was blunt-ended by filling in the 5'
overhangs with Klenow fragment then ligated into pCGN5139 that had been digested
with Asp718I and HindIII and blunt-ended by filling in the 5' overhangs with Klenow
fragment. A plasmid containing the insert oriented so that the d35S promoter was
closest to the blunted Asp718I site of pCGN5139 and the tml 3' was closest to the
blunted HindIII site was subjected to sequence analysis to confirm both the insert
orientation and the integrity of cloning junctions. The resulting plasmid was
designated pCGN8625.

The plasmid construct pCGN8640 is a modification of pCGN8624 described
above. A 938bp PstI fragment isolated from transposon Tn7 which encodes bacterial
spectinomycin and streptomycin resistance (Fling et al. (1985), Nucleic Acids
Research 13(19):7095-7106), a determinant for E. coli and Agrobacterium selection,
was blunt ended with Pfu polymerase. The blunt ended fragment was ligated into
pCGN8624 that had been digested with SpeI and blunt ended with Pfu polymerase.
The region containing the PstI fragment was sequenced to confirm both the insert
orientation and the integrity of cloning junctions.

The spectinomycin resistance marker was introduced into pCGN8622 and
pCGN8623 as follows. A 7.7 Kbp AvrII-SnaBI fragment from pCGN8640 was
ligated to a 10.9 Kbp AvrII-SnaBI fragment from pCGN8623 or pCGN8622,
described above. The resulting plasmids were pCGN8641 and pCGN8643,
respectively.

The plasmid pCGN8644 was constructed by ligating oligonucleotides 5'-
GATCACCTGCAGGAAGCTTGCGGCCGCGGATCCAATGCA-3' (SEQ ID
NO:49) and 5'-TTGGATCCGCGGCCGCAAGCTTCCTGCAGGT-3' (SEQ ID
NO:50) into BamHI-PstI digested pCGN8640.

Synthetic oligonucleotides were designed for use in Polymerase Chain Reactions
(PCR) to amplify the coding sequences of ATPT2, ATPT3, ATPT4, ATPT8, and
ATPT12 for the preparation of expression constructs and are provided in Table 2 below.

Table 2:

Name     Restriction Site   Sequence                                             SEQ ID NO:
ATPT2    5' NotI            GGATCCGCGGCCGCACAATGGAGTCTCTGCTCTCTAGTTCT            51
ATPT2    3' SseI            GGATCCTGCAGGTCACTTCAAAAAAGGTAACAGCAAGT               52
ATPT3    5' NotI            GGATCCGCGGCCGCACAATGGCGTTTTTTGGGCTCTCCCGTGTTT        53
ATPT3    3' SseI            GGATCCTGCAGGTTATTGAAAACTTCTTCCAAGTACAACT             54
ATPT4    5' NotI            GGATCCGCGGCCGCACAATGTGGCGAAGATCTGTTGTT               55
ATPT4    3' SseI            GGATCCTGCAGGTCATGGAGAGTAGAAGGAAGGAGCT                56
ATPT8    5' NotI            GGATCCGCGGCCGCACAATGGTACTTGCCGAGGTTCCAAAGCTTGCCTCT   57
ATPT8    3' SseI            GGATCCTGCAGGTCACTTGTTTCTGGTGATGACTCTAT               58
ATPT12   5' NotI            GGATCCGCGGCCGCACAATGACTTCGATTCTCAACACT               59
ATPT12   3' SseI            GGATCCTGCAGGTCAGTGTTGCGATGCTAATGCCGT                 60

The coding sequences of ATPT2, ATPT3, ATPT4, ATPT8, and ATPT12 were all
amplified using the respective PCR primers shown in Table 2 above and cloned into the
TopoTA vector (Invitrogen). Constructs containing the respective prenyltransferase
sequences were digested with NotI and Sse8387I and cloned into the turbo binary vectors
described above.
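
Because the cloning step relies on NotI and Sse8387I sites carried by the primers, a primer can be screened for the expected sites before ordering. The following is a minimal sketch using the two ATPT2 primers (SEQ ID NOs:51 and 52); the function name `find_sites` is hypothetical and the recognition sequences are the commonly cited ones, stated here as assumptions.

```python
# Recognition sequences assumed for illustration.
SITES = {
    "BamHI": "GGATCC",
    "NotI": "GCGGCCGC",
    "Sse8387I": "CCTGCAGG",
}


def find_sites(primer: str, sites: dict = SITES) -> dict:
    """Return {enzyme: 0-based position of first occurrence} for sites present."""
    primer = primer.upper()
    return {name: primer.find(seq) for name, seq in sites.items() if seq in primer}


ATPT2_FWD = "GGATCCGCGGCCGCACAATGGAGTCTCTGCTCTCTAGTTCT"  # SEQ ID NO:51
ATPT2_REV = "GGATCCTGCAGGTCACTTCAAAAAAGGTAACAGCAAGT"     # SEQ ID NO:52

print("forward:", find_sites(ATPT2_FWD))
print("reverse:", find_sites(ATPT2_REV))
```

In this sketch the forward primer carries the NotI site upstream of the ATG start codon and the reverse primer carries the Sse8387I site, so a NotI/Sse8387I double digest releases the coding sequence in a defined orientation.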

The sequence encoding ATPT2 prenyltransferase was cloned in the sense
orientation into pCGN8640 to produce the plant transformation construct pCGN10800
(Figure 2). The ATPT2 sequence is under control of the 35S promoter.

The ATPT2 sequence was also cloned in the antisense orientation into the
construct pCGN8641 to create pCGN10801 (Figure 3). This construct provides for the
antisense expression of the ATPT2 sequence from the napin promoter.

The ATPT2 coding sequence was also cloned in the sense orientation into the
vector pCGN8643 to create the plant transformation construct pCGN10822.

The ATPT2 coding sequence was also cloned in the antisense orientation into the
vector pCGN8644 to create the plant transformation construct pCGN10803 (Figure 4).

The ATPT4 coding sequence was cloned into the vector pCGN864 to create the
plant transformation construct pCGN10806 (Figure 5). The ATPT2 coding sequence was
cloned into the TopoTA™ vector from Invitrogen to create the plant
transformation construct pCGN10807 (Figure 6). The ATPT3 coding sequence was cloned
into the TopoTA vector to create the plant transformation construct pCGN10808 (Figure
7). The ATPT3 coding sequence was cloned in the sense orientation into the vector
pCGN8640 to create the plant transformation construct pCGN10809 (Figure 8). The
ATPT3 coding sequence was cloned in the antisense orientation into the vector
pCGN8641 to create the plant transformation construct pCGN10810 (Figure 9). The
ATPT3 coding sequence was cloned into the vector pCGN8643 to create the plant
transformation construct pCGN10811 (Figure 10). The ATPT3 coding sequence was
cloned into the vector pCGN8644 to create the plant transformation construct
pCGN10812 (Figure 11). The ATPT4 coding sequence was cloned into the vector
pCGN8640 to create the plant transformation construct pCGN10813 (Figure 12). The
ATPT4 coding sequence was cloned into the vector pCGN8641 to create the plant
transformation construct pCGN10814 (Figure 13). The ATPT4 coding sequence was
cloned into the vector pCGN8643 to create the plant transformation construct
pCGN10815 (Figure 14). The ATPT4 coding sequence was cloned in the antisense
orientation into the vector pCGN8644 to create the plant transformation construct
pCGN10816 (Figure 15). The ATPT8 coding sequence was cloned in the sense
orientation into the vector pCGN8643 to create the plant transformation construct
pCGN10819 (Figure 17). The ATPT12 coding sequence was cloned into the vector
pCGN8640 to create the plant transformation construct pCGN10824 (Figure 18). The
ATPT12 coding sequence was cloned into the vector pCGN8643 to create the plant
transformation construct pCGN10825 (Figure 19). The ATPT8 coding sequence was
cloned into the vector pCGN8640 to create the plant transformation construct
pCGN10826 (Figure 20).

Example 3: Plant Transformation with Prenyl Transferase Constructs 

Transgenic Brassica plants are obtained by Agrobacterium-mediated
transformation as described by Radke et al. (Theor. Appl. Genet. (1988) 75:685-694;
Plant Cell Reports (1992) 11:499-505). Transgenic Arabidopsis thaliana plants may
be obtained by Agrobacterium-mediated transformation as described by Valverkens et
al. (Proc. Nat. Acad. Sci. (1988) 85:5536-5540), or as described by Bent et al.
((1994), Science 265:1856-1860), or Bechtold et al. ((1993), C.R. Acad. Sci., Life
Sciences 316:1194-1199). Other plant species may be similarly transformed using
related techniques.

Alternatively, microprojectile bombardment methods, such as described by
Klein et al. (Bio/Technology 10:286-291), may also be used to obtain nuclear
transformed plants.

Example 4: Identification of Additional Prenyltransferases

Additional BLAST searches were performed using the ATPT2 sequence, a
sequence in the class of aromatic prenyltransferases. ESTs, and in some cases, full-
length coding regions, were identified in proprietary DNA libraries.

Soy full-length homologs to ATPT2 were identified by a combination of
BLAST (using ATPT2 protein sequence) and 5' RACE. Two homologs resulted
(SEQ ID NO:95 and SEQ ID NO:96). Translated amino acid sequences are provided
by SEQ ID NO:97 and SEQ ID NO:98.

A rice EST ATPT2 homolog is shown in SEQ ID NO:99 (obtained from
BLAST using the wheat ATPT2 homolog).

Other homolog sequences were obtained using ATPT2 and PSI-BLAST,
including EST sequences from wheat (SEQ ID NO:100), leek (SEQ ID NOs:101 and
102), canola (SEQ ID NO:103), corn (SEQ ID NOs:104, 105 and 106), cotton (SEQ
ID NO:107) and tomato (SEQ ID NO:108).

A PSI-BLAST profile generated using the E. coli ubiA (GenBank accession
1790473) sequence was used to analyze the Synechocystis genome. This analysis
identified 5 open reading frames (ORFs) in the Synechocystis genome that were
potentially prenyltransferases: slr0926 (annotated as ubiA, 4-hydroxybenzoate-
octaprenyltransferase; SEQ ID NO:32), sll1899 (annotated as ctaB, cytochrome c
oxidase folding protein; SEQ ID NO:33), slr0056 (annotated as g4, chlorophyll
synthase 33 kd subunit; SEQ ID NO:34), slr1518 (annotated as menA, menaquinone
biosynthesis protein; SEQ ID NO:35), and slr1736 (annotated as a hypothetical
protein of unknown function; SEQ ID NO:36).

4A. Synechocystis Knock-outs

To determine the functionality of these ORFs and their involvement, if any, in
the biosynthesis of tocopherols, knockout constructs were made to disrupt the ORFs
identified in Synechocystis.

Synthetic oligos were designed to amplify regions from the 5' (5'-
TAATGTGTACATTGTCGGCCTC (17365') (SEQ ID NO:61) and 5'-
GCAATGTAACATCAGAGATTTTGAGAC CCGCACCGTC (1736kanpr1)) (SEQ ID NO:62)
and 3' (5'-AGGCTAATAAGCACAAATGGGA (17363') (SEQ ID NO:63) and 5'-
GGTATGAGTCAGCAACACCTTCTTCACGAGGCAGACCTCAGCGGAATTGGTTTAGGTTATCCC
(1736kanpr2)) (SEQ ID NO:64) ends of the slr1736 ORF. The 1736kanpr1 and
1736kanpr2 oligos contained 20 bp of homology to the slr1736 ORF with an
additional 40 bp of sequence homology to the ends of the kanamycin resistance
cassette. Separate PCR steps were completed with these oligos
and the products were gel purified and combined with the kanamycin resistance gene
from pUC4K (Pharmacia) that had been digested with HincII and gel purified away
from the vector backbone. The combined fragments were allowed to assemble
without oligos under the following conditions: 94°C for 1 min, 55°C for 1 min, 72°C
for 1 min plus 5 seconds per cycle for 40 cycles using Pfu polymerase in 100 ul
reaction volume (Zhao, H and Arnold (1997) Nucleic Acids Res. 25(6):1307-1308).
One microliter or five microliters of this assembly reaction was then amplified using
5' and 3' oligos nested within the ends of the ORF fragment, so that the resulting
product contained 100-200 bp of the 5' end of the Synechocystis gene to be knocked
out, the kanamycin resistance cassette, and 100-200 bp of the 3' end of the gene to be
knocked out. This PCR product was then cloned into the vector pGemT easy
(Promega) to create the construct pMON21681 and used for Synechocystis
transformation.

Primers were also synthesized for the preparation of Synechocystis knockout
constructs for the other sequences using the same method as described above, with the
following primers. The ubiA 5' sequence was amplified using the primers 5'-
GGATCCATGGTTGCCCAAACCCCATC (SEQ ID NO:65) and 5'-
GCAATGTAACATCAGAGATTTTGAGACACAACGTGGCTTTGGGTAAGCAACAATGACCGGC
(SEQ ID NO:66). The 3' region was amplified using the synthetic oligonucleotide
primers 5'-GAATTCTCAAAGCCAGCCCAGTAAC (SEQ ID NO:67) and 5'-
GGTATGAGTCAGCAACACCTTCTTCACGAGGCAGACCTCAGCGGGTGCGAAAAGGGTTTTCCC
(SEQ ID NO:68). The amplification products were combined with the
kanamycin resistance gene from pUC4K (Pharmacia) that had been digested with
HincII and gel purified away from the vector backbone. The annealed fragment was
amplified using 5' and 3' oligos nested within the ends of the ORF fragment (5'-
CCAGTGGTTTAGGCTGTGTGGTC (SEQ ID NO:69) and 5'-
CTGAGTTGGATGTATTGGATC (SEQ ID NO:70)), so that the resulting product
contained 100-200 bp of the 5' end of the Synechocystis gene to be knocked out, the
kanamycin resistance cassette, and 100-200 bp of the 3' end of the gene to be knocked
out. This PCR product was then cloned into the vector pGemT easy (Promega) to
create the construct pMON21682 and used for Synechocystis transformation.

Primers were also synthesized for the preparation of Synechocystis knockout
constructs for the other sequences using the same method as described above, with the
following primers. The sll1899 5' sequence was amplified using the primers 5'-
GGATCCATGGTTACTTCGACAAAAATCC (SEQ ID NO:71) and 5'-
GCAATGTAACATCAGAGATTTTGAGACACAACGTGGCTTTGCTAGGCAACCGCTTAGTAC
(SEQ ID NO:72). The 3' region was amplified using the synthetic oligonucleotide
primers 5'-GAATTCTTAACCCAACAGTAAAGTTCCC (SEQ ID NO:73) and 5'-
GGTATGAGTCAGCAACACCTTCTTCACGAGGCAGACCTCAGCGCCGGCATTGTCTTTTACATG
(SEQ ID NO:74). The amplification products were combined with the kanamycin
resistance gene from pUC4K (Pharmacia) that had been digested with HincII and gel
purified away from the vector backbone. The annealed fragment was amplified using
5' and 3' oligos nested within the ends of the ORF fragment (5'-
GGAACCCTTGCAGCCGCTTC (SEQ ID NO:75)
and 5'-GTATGCCCAACTGGTGCAGAGG (SEQ ID NO:76)), so that the resulting
product contained 100-200 bp of the 5' end of the Synechocystis gene to be knocked
out, the kanamycin resistance cassette, and 100-200 bp of the 3' end of the gene to be
knocked out. This PCR product was then cloned into the vector pGemT easy
(Promega) to create the construct pMON21679 and used for Synechocystis
transformation.

Primers were also synthesized for the preparation of Synechocystis knockout
constructs for the other sequences using the same method as described above, with the
following primers. The slr0056 5' sequence was amplified using the primers 5'-
GGATCCATGTCTGACACACAAAATACCG (SEQ ID NO:77) and 5'-
GCAATGTAACATCAGAGATTTTGAGACACAACGTGGCTTTCGCCAATACCAGCCACCAACAG
(SEQ ID NO:78). The 3' region was amplified using the
synthetic oligonucleotide primers 5'-GAATTCTCAAATCCCCGCATGGCCTAG
(SEQ ID NO:79) and 5'-
GGTATGAGTCAGCAACACCTTCTTCACGAGGCAGACCTCAGCGGCCTACGGCTTGGACGTGTGGG
(SEQ ID NO:80). The amplification products were
combined with the kanamycin resistance gene from pUC4K (Pharmacia) that had been
digested with HincII and gel purified away from the vector backbone. The annealed
fragment was amplified using 5' and 3' oligos nested within the ends of the ORF
fragment (5'-CACTTGGATTCCCCTGATCTG (SEQ ID NO:81) and 5'-
GCAATACCCGCTTGGAAAACG (SEQ ID NO:82)), so that the resulting product
contained 100-200 bp of the 5' end of the Synechocystis gene to be knocked out, the
kanamycin resistance cassette, and 100-200 bp of the 3' end of the gene to be knocked
out. This PCR product was then cloned into the vector pGemT easy (Promega) to
create the construct pMON21677 and used for Synechocystis transformation.

Primers were also synthesized for the preparation of Synechocystis knockout
constructs for the other sequences using the same method as described above, with the
following primers. The slr1518 5' sequence was amplified using the primers 5'-
GGATCCATGACCGAATCTTCGCCCCTAGC (SEQ ID NO:83) and 5'-
GCAATGTAACATCAGAGATTTTGAGACACAACGTGGCTTTCAATCCTAGGTAGCCGAGGCG
(SEQ ID NO:84). The 3' region was
amplified using the synthetic oligonucleotide primers 5'-
GAATTCTTAGCCCAGGCCAGCCCAGCC (SEQ ID NO:85) and 5'-
GGTATGAGTCAGCAACACCTTCTTCACGAGGCAGACCTCAGCGGGGAATTGATTTGTTTAATTACC
(SEQ ID NO:86). The
amplification products were combined with the kanamycin resistance gene from
pUC4K (Pharmacia) that had been digested with HincII and gel purified away from the
vector backbone. The annealed fragment was amplified using 5' and 3' oligos nested
within the ends of the ORF fragment (5'-GCGATCGCCATTATCGCTTGG (SEQ ID
NO:87) and 5'-GCAGACTGGCAATTATCAGTAACG (SEQ ID NO:88)), so that
the resulting product contained 100-200 bp of the 5' end of the Synechocystis gene to
be knocked out, the kanamycin resistance cassette, and 100-200 bp of the 3' end of the
gene to be knocked out. This PCR product was then cloned into the vector pGemT
easy (Promega) to create the construct pMON21680 and used for Synechocystis
transformation.

4B. Transformation of Synechocystis 

Cells of Synechocystis 6803 were grown to a density of approximately 2x10^8
cells per ml and harvested by centrifugation. The cell pellet was re-suspended in fresh
BG-11 medium (ATCC Medium 616) at a density of 1x10^8 cells per ml and used
immediately for transformation. One hundred microliters of these cells were mixed
with 5 ul of mini prep DNA and incubated with light at 30°C for 4 hours. This mixture
was then plated onto nylon filters resting on BG-11 agar supplemented with TES pH 8
and allowed to grow for 12-18 hours. The filters were then transferred to BG-11 agar
+ TES + 5 ug/ml kanamycin and allowed to grow until colonies appeared within 7-10
days (Packer and Glazer, 1988). Colonies were then picked into BG-11 liquid media
containing 5 ug/ml kanamycin and allowed to grow for 5 days. These cells were then
transferred to BG-11 media containing 10 ug/ml kanamycin and allowed to grow for 5
days and then transferred to BG-11 + kanamycin at 25 ug/ml and allowed to grow for 5
days. Cells were then harvested for PCR analysis to determine the presence of a
disrupted ORF and also for HPLC analysis to determine if the disruption had any
effect on tocopherol levels.

PCR analysis of the Synechocystis isolates for slr1736 and sll1899 showed
complete segregation of the mutant genome, meaning no copies of the wild type
genome could be detected in these strains. This suggests that function of the native
gene is not essential for cell function. HPLC analysis of these same isolates showed
that the sll1899 strain had no detectable reduction in tocopherol levels. However, the
strain carrying the knockout for slr1736 produced no detectable levels of tocopherol.

The amino acid sequences for the Synechocystis knockouts are compared using
ClustalW, and are provided in Table 3 below. Provided are the percent identities,
percent similarity, and the percent gap. The alignment of the sequences is provided in
Figure 21.



Table 3:

                       slr1736   slr0926   sll1899   slr0056   slr1518
slr1736   %identity                 14        12        18        11
          %similar                  29        30        34        26
          %gap                       8         7        10         5
slr0926   %identity                           20        19        14
          %similar                            39        32        28
          %gap                                 7         9         4
sll1899   %identity                                     17        13
          %similar                                      29        29
          %gap                                          12         9
slr0056   %identity                                               15
          %similar                                                31
          %gap                                                     8
slr1518   %identity
          %similar
          %gap
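
Percent identity and percent gap values of the kind reported in Table 3 can be derived from a pairwise alignment as sketched below. This is an illustration only: the use of the full alignment length as the denominator is an assumption (alignment programs differ on this point), percent similarity is omitted because it depends on the substitution groupings chosen, and the function name `alignment_stats` and the example sequences are hypothetical.

```python
def alignment_stats(seq_a: str, seq_b: str) -> dict:
    """Compute percent identity and percent gap for two aligned sequences.

    Both inputs must already be aligned (same length, '-' marking gaps).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    if not seq_a:
        raise ValueError("alignment is empty")
    columns = len(seq_a)
    pairs = list(zip(seq_a, seq_b))
    identical = sum(1 for x, y in pairs if x == y and x != "-")
    gapped = sum(1 for x, y in pairs if x == "-" or y == "-")
    return {
        "identity": 100.0 * identical / columns,
        "gap": 100.0 * gapped / columns,
    }


stats = alignment_stats("MKT-LV", "MRTALV")
print(round(stats["identity"], 1), round(stats["gap"], 1))
```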











Amino acid sequence comparisons are performed using various Arabidopsis
prenyltransferase sequences and the Synechocystis sequences. The comparisons are
presented in Table 4 below. Provided are the percent identities, percent similarity,
and the percent gap. The alignment of the sequences is provided in Figure 22.

4C. Phytyl Prenyltransferase Enzyme Assays

[3H]Homogentisic acid in 0.1% H3PO4 (specific radioactivity 40 Ci/mmol).
Phytyl pyrophosphate was synthesized as described by Joo, et al. (1973) Can J.
Biochem. 51:1527. 2-methyl-6-phytylquinol and 2,3-dimethyl-5-phytylquinol were
synthesized as described by Soll, et al. (1980) Phytochemistry 19:215. Homogentisic
acid, α, β, δ, and γ-tocopherol, and tocol were purchased commercially.

The wild-type strain of Synechocystis sp. PCC 6803 was grown in BG11 medium
with bubbling air at 30°C under 50 uE m^-2 s^-1 fluorescent light, and 70% relative
humidity. The growth medium of the slr1736 knock-out (potential PPT) strain of this
organism was supplemented with 25 ug ml^-1 kanamycin. Cells were collected from
0.25 to 1 liter culture by centrifugation at 5000 g for 10 min and stored at -80°C.

Total membranes were isolated according to Zak's procedures with some
modifications (Zak, et al. (1999) Eur J. Biochem 261:311). Cells were broken on a
French press. Before the French press treatment, the cells were incubated for 1 hour
with lysozyme (0.5%, w/v) at 30°C in a medium containing 7 mM EDTA, 5 mM NaCl
and 10 mM Hepes-NaOH, pH 7.4. The spheroplasts were collected by centrifugation at
5000 g for 10 min and resuspended at 0.1 - 0.5 mg chlorophyll ml^-1 in 20 mM
potassium phosphate buffer, pH 7.8. A proper amount of protease inhibitor cocktail and
DNAase I from Boehringer Mannheim were added to the solution. French press
treatments were performed two to three times at 100 MPa. After breakage, the cell
suspension was centrifuged for 10 min at 5000 g to pellet unbroken cells, and this was
followed by centrifugation at 100 000 g for 1 hour to collect total membranes. The final
pellet was resuspended in a buffer containing 50 mM Tris-HCl and 4 mM MgCl2.

Chloroplast pellets were isolated from 250 g of spinach leaves obtained from local
markets. Deveined leaf sections were cut into grinding buffer (2 L/250 g leaves)
containing 2 mM EDTA, 1 mM MgCl2, 1 mM MnCl2, 0.33 M sorbitol, 0.1% ascorbic
acid, and 50 mM Hepes at pH 7.5. The leaves were homogenized for 3 sec three times
in a 1-L blender, and filtered through 4 layers of Miracloth. The supernatant was then
centrifuged at 5000 g for 6 min. The chloroplast pellets were resuspended in a small
amount of grinding buffer (Douce, et al. Methods in Chloroplast Molecular Biology,
239 (1982)).

Chloroplasts in pellets can be broken in three ways. Chloroplast pellets were first
aliquoted in 1 mg of chlorophyll per tube, centrifuged at 6000 rpm for 2 min in a
microcentrifuge, and grinding buffer was removed. Two hundred microliters of Triton
X-100 buffer (0.1% Triton X-100, 50 mM Tris-HCl pH 7.6 and 4 mM MgCl2) or
swelling buffer (10 mM Tris pH 7.6 and 4 mM MgCl2) was added to each tube and
incubated for 14 hour at 4°C. Then the broken chloroplast pellets were used for the
assay immediately. In addition, broken chloroplasts can also be obtained by freezing
in liquid nitrogen and stored at -80°C for 14 hour, then used for the assay.

In some cases chloroplast pellets were further purified with a 40%/80% percoll
gradient to obtain intact chloroplasts. The intact chloroplasts were broken with
swelling buffer, then either used for assay or further purified for envelope membranes
with a 20.5%/31.8% sucrose density gradient (Soll, et al. (1980) supra). The membrane
fractions were centrifuged at 100 000 g for 40 min and resuspended in 50 mM Tris-HCl
pH 7.6, 4 mM MgCl2.

Various amounts of [3H]HGA, 40 to 60 uM unlabeled HGA with specific activity
in the range of 0.16 to 4 Ci/mmole were mixed with a proper amount of 1 M Tris-
NaOH pH 10 to adjust pH to 7.6. HGA was reduced for 4 min with a trace amount of
solid NaBH4. In addition to HGA, the standard incubation mixture (final vol 1 mL)
contained 50 mM Tris-HCl, pH 7.6, 3-5 mM MgCl2, and 100 uM phytyl
pyrophosphate. The reaction was initiated by addition of Synechocystis total
membranes, spinach chloroplast pellets, spinach broken chloroplasts, or spinach
envelope membranes. The enzyme reaction was carried out for 2 hour at 23°C or 30°C
in the dark or light. The reaction was stopped by freezing with liquid nitrogen, and
stored at -80°C or directly by extraction.

A constant amount of tocol was added to each assay mixture and reaction products
were extracted with a 2 mL mixture of chloroform/methanol (1:2, v/v) to give a
monophasic solution. NaCl solution (2 mL; 0.9%) was added with vigorous shaking.
This extraction procedure was repeated three times. The organic layer containing the 
prenylquinones was filtered through a 20 mu filter, evaporated under and then 
resuspended in 100 uL of ethanol. 

The samples were mainly analyzed by a Normal-Phase HPLC method (Isocratic
90% Hexane and 10% Methyl-t-butyl ether), using a Zorbax silica column, 4.6 x 250
mm. The samples were also analyzed by a Reversed-Phase HPLC method (Isocratic
0.1% H3PO4 in MeOH), using a Vydac 201HS54 C18 column, 4.6 x 250 mm,
coupled with an Alltech C18 guard column. The amounts of products were calculated
based on the substrate specific radioactivity, and adjusted according to the % recovery
based on the amount of internal standard.
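
The calculation described above can be illustrated with a short sketch. It assumes that product radioactivity is measured in disintegrations per minute (dpm), that 1 Ci corresponds to 2.22e12 dpm, and that recovery is expressed as the fraction of the internal standard (tocol) recovered; the function name `product_pmol` and the example numbers are hypothetical and are not data from the assays.

```python
DPM_PER_CI = 2.22e12   # disintegrations per minute in one curie
PMOL_PER_MMOL = 1e9


def product_pmol(product_dpm: float,
                 specific_activity_ci_per_mmol: float,
                 recovery_fraction: float) -> float:
    """Convert measured product radioactivity to pmol of product.

    The result is corrected for extraction losses using the fractional
    recovery of the internal standard.
    """
    if specific_activity_ci_per_mmol <= 0:
        raise ValueError("specific activity must be positive")
    if not 0 < recovery_fraction <= 1:
        raise ValueError("recovery fraction must be in (0, 1]")
    dpm_per_pmol = specific_activity_ci_per_mmol * DPM_PER_CI / PMOL_PER_MMOL
    return product_dpm / dpm_per_pmol / recovery_fraction


# Hypothetical example: 4440 dpm in the product peak, substrate at
# 1 Ci/mmol, and 80% recovery of the internal standard.
print(product_pmol(4440, 1.0, 0.8))
```

Dividing by the recovery fraction scales the measured amount up to what would have been observed had no product been lost during extraction.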

The amount of chlorophyll was determined as described in Arnon (1949) Plant
Physiol. 24:1. The amount of protein was determined by the Bradford method using
gamma globulin as a standard (Bradford, (1976) Anal. Biochem. 72:248).

Results of the assay demonstrate that 2-Methyl-6-Phytylplastoquinone is not
produced in the Synechocystis slr1736 knockout preparations. The results of the
phytyl prenyltransferase enzyme activity assay for the slr1736 knockout are presented
in Figure 23.

4D. Complementation of the slr1736 knockout with ATPT2

In order to determine whether ATPT2 could complement the knockout of
slr1736 in Synechocystis 6803, a plasmid was constructed to express the ATPT2
sequence from the TAC promoter. A vector, plasmid psl1211, was obtained from the
lab of Dr. Himadri Pakrasi of Washington University, and is based on the plasmid
RSF1010, which is a broad host range plasmid (Ng W.-O., Zentella R., Wang, Y.,
Taylor J-S. A., Pakrasi, H.B. 2000. phrA, the major photoreactivating factor in the
cyanobacterium Synechocystis sp. strain PCC 6803 codes for a cyclobutane
pyrimidine dimer specific DNA photolyase. Arch. Microbiol. (in press)). The ATPT2
gene was isolated from the vector pCGN10817 by PCR using the following primers:
ATPT2nco.pr 5'-CCATGGATTCGAGTAAAGTTGTCGC (SEQ ID NO:89);
ATPT2ri.pr 5'-GAATTCACTTCAAAAAAGGTAACAG (SEQ ID NO:90). These
primers remove approximately 112 bp from the 5' end of the ATPT2 sequence,
which is thought to encode the chloroplast transit peptide. These primers also add an
NcoI site at the 5' end and an EcoRI site at the 3' end, which can be used for
subcloning into subsequent vectors. The PCR product obtained with these primers and
pCGN10817 was ligated into pGEM T easy, and the resulting vector, pMON21689, was
confirmed by sequencing using the M13 forward and M13 reverse primers. The
NcoI/EcoRI fragment from pMON21689 was then ligated with the EagI/EcoRI and
EagI/NcoI fragments from psl1211, resulting in pMON21690. The plasmid

pMON21690 was introduced into the slr1736 Synechocystis 6803 KO strain via
conjugation. Cells of SL906 (a helper strain) and DH10B cells containing
pMON21690 were grown to log phase (O.D. 600 = 0.4) and 1 ml was harvested by
centrifugation. The cell pellets were washed twice with a sterile BG-11 solution and
resuspended in 200 ul of BG-11. The following was mixed in a sterile Eppendorf
tube: 50 ul SL906, 50 ul DH10B cells containing pMON21690, and 100 ul of a fresh
culture of the slr1736 Synechocystis 6803 KO strain (O.D. 730 = 0.2-0.4). The cell
mixture was immediately transferred to a nitrocellulose filter resting on BG-11 and
incubated for 24 hours at 30°C and 2500 lux (50 uE) of light. The filter was then
transferred to BG-11 supplemented with 10 ug/ml gentamycin and incubated as above
for ~5 days. When colonies appeared, they were picked and grown up in liquid BG-11
+ gentamycin 10 ug/ml (Elhai, J. and Wolk, P. 1988. Conjugal transfer of DNA
to Cyanobacteria. Methods in Enzymology 167, 747-54). The liquid cultures were then
assayed for tocopherols by harvesting 1 ml of culture by centrifugation, extracting with
ethanol/pyrogallol, and separating by HPLC. The slr1736 Synechocystis 6803 KO strain
did not contain any detectable tocopherols, while the slr1736 Synechocystis 6803 KO
strain transformed with pMON21690 contained detectable alpha tocopherol. A
Synechocystis 6803 strain transformed with psl1211 (vector control) produced alpha
tocopherol as well.
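
The primer design described above (an NcoI site added at the 5' end and an EcoRI site at the 3' end) can be checked with a short script. This is a sketch for illustration only: the primer sequences are those given as SEQ ID NO:89 and SEQ ID NO:90, the recognition sequences are the standard ones for NcoI (CCATGG) and EcoRI (GAATTC), and the helper name `site_positions` is hypothetical.

```python
# Standard recognition sequences for the two enzymes named in the text
RECOGNITION_SITES = {"NcoI": "CCATGG", "EcoRI": "GAATTC"}

# Primer sequences as given in SEQ ID NO:89 and SEQ ID NO:90 (5' to 3')
PRIMERS = {
    "ATPT2nco.pr": "CCATGGATTCGAGTAAAGTTGTCGC",
    "ATPT2ri.pr": "GAATTCACTTCAAAAAAGGTAACAG",
}


def site_positions(sequence):
    """Return {enzyme: 0-based index of the first site, or -1 if absent}."""
    return {name: sequence.find(site) for name, site in RECOGNITION_SITES.items()}


# One entry per primer, showing which site each primer carries and where
report = {name: site_positions(seq) for name, seq in PRIMERS.items()}

# The NcoI site CCATGG contains an ATG; presumably this supplies the start
# codon once the putative transit peptide region has been removed.
atg_offset = PRIMERS["ATPT2nco.pr"].find("ATG")
```

Each primer carries its site at the very 5' end, so the amplified fragment is flanked by NcoI on one side and EcoRI on the other, as the subcloning step requires.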

4E: Additional Evidence of Prenyltransferase Activity

To test the hypothesis that slr1736 or ATPT2 are sufficient as single genes to
obtain phytyl prenyltransferase activity, both genes were expressed in SF9 cells and in
yeast. When either slr1736 or ATPT2 was expressed in insect cells (Table 5) or in
yeast, phytyl prenyltransferase activity was detectable in membrane preparations,
whereas membrane preparations of the yeast vector control, or membrane preparations
of control insect cells, did not exhibit phytyl prenyltransferase activity.

Table 5: Phytyl prenyltransferase activity

Enzyme source                      Enzyme activity [pmol/mg x h]
slr1736 expressed in SF9 cells     20
ATPT2 expressed in SF9 cells       6
SF9 cell control                   <0.05
Synechocystis 6803                 0.25
Spinach chloroplasts               0.20

Example 5: Transgenic Plant Analysis
5A. Arabidopsis

Arabidopsis plants transformed with constructs for the sense or antisense
expression of the ATPT proteins were analyzed by High Pressure Liquid
Chromatography (HPLC) for altered levels of total tocopherols, as well as altered
levels of specific tocopherols (alpha, beta, gamma, and delta tocopherol).

Extracts of leaves and seeds were prepared for HPLC as follows. For seed
extracts, 10 mg of seed was added to 1 g of microbeads (Biospec) in a sterile
microfuge tube to which 500 ul 1% pyrogallol (Sigma Chem)/ethanol was added.
The mixture was shaken for 3 minutes in a mini Beadbeater (Biospec) on "fast" speed.
The extract was filtered through a 0.2 um filter into an autosampler tube. The filtered
extracts were then used in the HPLC analysis described below.

Leaf extracts were prepared by mixing 30-50 mg of leaf tissue with 1 g
microbeads and freezing in liquid nitrogen until extraction. For extraction, 500 ul 1%
pyrogallol in ethanol was added to the leaf/bead mixture and shaken for 1 minute on a
Beadbeater (Biospec) on "fast" speed. The resulting mixture was centrifuged for 4
minutes at 14,000 rpm and filtered as described above prior to HPLC analysis.

HPLC was performed on a Zorbax silica HPLC column (4.6 mm x 250 mm)
with fluorescence detection, an excitation at 290 nm, an emission at 336 nm, and
bandpass and slits. Solvent A was hexane and solvent B was methyl-t-butyl ether.
The injection volume was 20 ul, the flow rate was 1.5 ml/min, and the run time was 12
min (40°C) using the gradient shown in Table 6:

Table 6:

Time       Solvent A    Solvent B
0 min.     90%          10%
10 min.    90%          10%
11 min.    25%          75%
12 min.    90%          10%
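
The gradient program in Table 6 can be written as a small lookup that returns the solvent composition at any time in the run. This is a sketch under a stated assumption: the pump is taken to ramp linearly between the programmed time points, which the text does not specify. The names `GRADIENT_A` and `percent_solvent_a` are hypothetical.

```python
# (time in minutes, % solvent A) taken from Table 6; solvent B is the remainder
GRADIENT_A = [(0.0, 90.0), (10.0, 90.0), (11.0, 25.0), (12.0, 90.0)]


def percent_solvent_a(t_min):
    """Percent solvent A (hexane) at time t_min.

    Assumes linear ramps between the programmed points (an assumption,
    not stated in the method).
    """
    if not GRADIENT_A[0][0] <= t_min <= GRADIENT_A[-1][0]:
        raise ValueError("time is outside the 0-12 min run")
    for (t0, a0), (t1, a1) in zip(GRADIENT_A, GRADIENT_A[1:]):
        if t0 <= t_min <= t1:
            # Linear interpolation within the current segment
            return a0 + (a1 - a0) * (t_min - t0) / (t1 - t0)
    raise ValueError("time not covered by the gradient table")


def percent_solvent_b(t_min):
    """Percent solvent B (methyl-t-butyl ether) at time t_min."""
    return 100.0 - percent_solvent_a(t_min)
```

Under the linear-ramp assumption the run is isocratic at 90% A for the first 10 minutes, followed by a one-minute step up to 75% B and a one-minute return to the starting conditions.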

Tocopherol standards in 1% pyrogallol/ethanol were also run for comparison
(alpha tocopherol, gamma tocopherol, beta tocopherol, delta tocopherol, and
tocopherol (tocol)) (all from Matreya).

Standard curves for alpha, beta, delta, and gamma tocopherol were calculated
using Chemstation software. The absolute amount of component x is:

Absolute amount of x = Response_x × RF_x × dilution factor

where Response_x is the area of peak x, RF_x is the response factor for component x
(Amount_x/Response_x), and the dilution factor is 500 ul. The ng/mg tissue is found by:
total ng component/mg plant tissue.
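
The calculation above can be sketched in a few lines. This is an illustration only: it assumes the response factor is expressed in ng per ul per unit of peak area (so that multiplying by the 500 ul dilution factor gives ng), and the peak area, response factor, and tissue mass in the example are hypothetical values, not measured data.

```python
def absolute_amount_ng(peak_area, response_factor, dilution_ul=500.0):
    """Absolute amount of x = Response_x * RF_x * dilution factor.

    Assumes RF_x is given in ng per ul per unit peak area (an assumption).
    """
    return peak_area * response_factor * dilution_ul


def ng_per_mg_tissue(total_ng, tissue_mg):
    """Normalize the total amount of a component to the mass of tissue extracted."""
    if tissue_mg <= 0:
        raise ValueError("tissue mass must be positive")
    return total_ng / tissue_mg


# Hypothetical example: peak area 1200, RF 0.002 ng/ul per area unit, 10 mg of seed
total_ng = absolute_amount_ng(1200.0, 0.002)
per_mg = ng_per_mg_tissue(total_ng, 10.0)
```

With these hypothetical inputs the extract contains about 1200 ng of the component, or about 120 ng per mg of seed.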

Results of the HPLC analysis of seed extracts of transgenic Arabidopsis lines
containing pMON10822 for the expression of ATPT2 from the napin promoter are
provided in Figure 24.

HPLC analysis of segregating T2 Arabidopsis seed tissue expressing
the ATPT2 sequence from the napin promoter (pCGN10822) demonstrates an
increased level of tocopherols in the seed. Total tocopherol levels are increased as
much as 50% over the total tocopherol levels of non-transformed (wild-type)
Arabidopsis plants (Figure 25). Homozygous progeny from the top 3 lines (T3 seed)
have up to a two-fold (100%) increase in total tocopherol levels over control
Arabidopsis seed (Figure 26).

Furthermore, levels of particular tocopherols are also increased in
transgenic Arabidopsis plants expressing the ATPT2 nucleic acid sequence from the
napin promoter. Levels of delta tocopherol in these lines are increased greater than 3
fold over the delta tocopherol levels obtained from the seeds of wild type Arabidopsis
lines. Levels of gamma tocopherol in transgenic Arabidopsis lines expressing the
ATPT2 nucleic acid sequence are increased as much as about 60% over the levels
obtained in the seeds of non-transgenic control lines. Furthermore, levels of alpha
tocopherol are increased as much as 3 fold over those obtained from non-transgenic
control lines.

Results of the HPLC analysis of seed extracts of transgenic Arabidopsis lines
containing pCGN10803 for the expression of ATPT2 from the enhanced 35S
promoter (antisense orientation) are provided in Figure 25. Two lines were identified
that have reduced total tocopherols, with up to a ten-fold decrease observed in T3 seed
compared to control Arabidopsis (Figure 27).

5B. Canola 

Brassica napus, variety SP30021, was transformed with pCGN10822
(napin-ATPT2-napin 3', sense orientation) using Agrobacterium tumefaciens-mediated
transformation. Flowers of the R0 plants were tagged upon pollination and
developing seed was collected at 35 and 45 days after pollination (DAP).

Developing seed was assayed for tocopherol levels, as described above for
Arabidopsis. Line 10822-1 shows a 20% increase of total tocopherols, compared to
the wild-type control, at 45 DAP. Figure 28 shows total tocopherol levels measured
in developing canola seed.

Example 6: Sequences to Tocopherol Cyclase 
6A. Preparation of the slr1737 Knockout

The Synechocystis sp. 6803 slr1737 knockout was constructed by the
following method. The GPS™-1 Genome Priming System (New England Biolabs)
was used to insert, by a Tn7 Transposase system, a kanamycin resistance cassette
into slr1737. A plasmid from a Synechocystis genomic library clone containing 652
base pairs of the targeted orf (Synechocystis genome base pairs 1324051 - 1324703;
the predicted orf base pairs 1323672 - 1324763, as annotated by Cyanobase) was used
as target DNA. The reaction was performed according to the manufacturer's protocol.
The reaction mixture was then transformed into E. coli DH10B electrocompetent cells
and plated. Colonies from this transformation were then screened for transposon
insertions into the target sequence by amplifying with M13 Forward and Reverse
Universal primers, yielding a product of 652 base pairs plus ~1700 base pairs, the size
of the transposon kanamycin cassette, for a total fragment size of ~2300 base pairs.
After this determination, it was then necessary to determine the approximate location
of the insertion within the targeted orf, as 100 base pairs of orf sequence was
estimated as necessary for efficient homologous recombination in Synechocystis. This
was accomplished through amplification reactions using either of the primers to the
ends of the transposon, Primer S (5' end) or N (3' end), in combination with either an
M13 Forward or Reverse primer. That is, four different primer combinations were
used to map each potential knockout construct: Primer S - M13 Forward, Primer S -
M13 Reverse, Primer N - M13 Forward, Primer N - M13 Reverse. The construct
used to transform Synechocystis and knock out slr1737 was determined to consist of
approximately 150 base pairs of slr1737 sequence on the 5' side of the transposon
insertion and approximately 500 base pairs on the 3' side, with the transcription of the
orf and kanamycin cassette in the same direction. The nucleic acid sequence of
slr1737 is provided in SEQ ID NO:38; the deduced amino acid sequence is provided in
SEQ ID NO:39.
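
The coordinates and fragment sizes quoted above can be cross-checked with simple arithmetic. The sketch below uses only numbers from the text; the variable and function names are hypothetical, and treating the 100 base pair figure as a hard minimum for each flank is an assumption made for illustration.

```python
# Genome coordinates quoted in the text (Cyanobase annotation)
ORF_START, ORF_END = 1_323_672, 1_324_763
CLONE_START, CLONE_END = 1_324_051, 1_324_703
TRANSPOSON_BP = 1700   # approximate size of the transposon kanamycin cassette
MIN_FLANK_BP = 100     # orf sequence estimated as necessary for recombination

orf_length_bp = ORF_END - ORF_START + 1          # inclusive coordinates
codons = orf_length_bp // 3                       # includes the stop codon
clone_span_bp = CLONE_END - CLONE_START           # the 652 bp quoted in the text
expected_pcr_bp = clone_span_bp + TRANSPOSON_BP   # M13 Forward/Reverse product
clone_inside_orf = ORF_START <= CLONE_START and CLONE_END <= ORF_END


def flanks_sufficient(five_prime_bp, three_prime_bp, minimum=MIN_FLANK_BP):
    """True if both arms flanking the insertion meet the minimum length."""
    return five_prime_bp >= minimum and three_prime_bp >= minimum


# Approximate flank sizes reported for the construct that was used
construct_ok = flanks_sufficient(150, 500)
```

The orf spans 1092 base pairs, that is 364 codons, which is consistent with a 363-residue protein plus a stop codon. The expected screening product of 652 + 1700 = 2352 base pairs agrees with the approximate 2300 base pair fragment quoted, and the 150 and 500 base pair arms together account for the roughly 650 base pairs of cloned orf sequence.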

Cells of Synechocystis 6803 were grown to a density of ~2x10^8 cells per ml
and harvested by centrifugation. The cell pellet was re-suspended in fresh BG-11
medium at a density of 1x10^9 cells per ml and used immediately for transformation.
100 ul of these cells were mixed with 5 ul of mini prep DNA and incubated with light
at 30°C for 4 hours. This mixture was then plated onto nylon filters resting on BG-11
agar supplemented with TES pH 8 and allowed to grow for 12-18 hours. The filters
were then transferred to BG-11 agar + TES + 5 ug/ml kanamycin and allowed to grow
until colonies appeared within 7-10 days (Packer and Glazer, 1988). Colonies were
then picked into BG-11 liquid media containing 5 ug/ml kanamycin and allowed to
grow for 5 days. These cells were then transferred to BG-11 media containing
10 ug/ml kanamycin and allowed to grow for 5 days and then transferred to BG-11 +
kanamycin at 25 ug/ml and allowed to grow for 5 days. Cells were then harvested for
PCR analysis to determine the presence of a disrupted ORF and also for HPLC
analysis to determine if the disruption had any effect on tocopherol levels.

PCR analysis of the Synechocystis isolates, using primers to the ends of the
slr1737 orf, showed complete segregation of the mutant genome, meaning no copies
of the wild type genome could be detected in these strains. This suggests that
function of the native gene is not essential for cell function. HPLC analysis of the
strain carrying the knockout for slr1737 showed no detectable levels of tocopherol.



6B. The relation of slr1737 and slr1736

The slr1737 gene occurs in Synechocystis downstream and in the same
orientation as slr1736, the phytyl prenyltransferase. In bacteria this proximity often
indicates an operon structure and therefore an expression pattern that is linked in all
genes belonging to this operon. Occasionally such operons contain several genes that
are required to constitute one enzyme. To confirm that slr1737 is not required for
phytyl prenyltransferase activity, phytyl prenyltransferase was measured in extracts
from the Synechocystis slr1737 knockout mutant. Figure 29 shows that extracts from
the Synechocystis slr1737 knockout mutant still contain phytyl prenyltransferase
activity. The molecular organization of genes in Synechocystis 6803 is shown in panel A.
Panels B and C show HPLC traces (normal phase HPLC) of reaction products
obtained with membrane preparations from Synechocystis wild type and the slr1737
knockout mutant, respectively.

The fact that slr1737 is not required for the PPT activity provides additional
evidence that ATPT2 and slr1736 encode phytyl prenyltransferases.

6C. Synechocystis Knockouts

Synechocystis 6803 wild type and the Synechocystis slr1737 knockout mutant
were grown photoautotrophically. Cells from a 20 ml culture in the late logarithmic
growth phase were harvested and extracted with ethanol. Extracts were separated by
isocratic normal-phase HPLC using hexane/methyl-t-butyl ether (95/5) and a
Zorbax silica column, 4.6 x 250 mm. Tocopherols and tocopherol intermediates were
detected by fluorescence (excitation 290 nm, emission 336 nm) (Figure 30).

Extracts of Synechocystis 6803 contained a clear signal of alpha-tocopherol.
2,3-Dimethyl-5-phytylplastoquinol was below the limit of detection in extracts from
the Synechocystis wild type (C). In contrast, extracts from the Synechocystis slr1737
knockout mutant did not contain alpha-tocopherol, but contained
2,3-dimethyl-5-phytylplastoquinol (D), indicating that the interruption of slr1737 has
resulted in a block of the 2,3-dimethyl-5-phytylplastoquinol cyclase reaction.

Chromatograms of standard compounds alpha, beta, gamma, delta-tocopherol
and 2,3-dimethyl-5-phytylplastoquinol are shown in A and B. Chromatograms of
extracts from Synechocystis wild type and the Synechocystis slr1737 knockout mutant
are shown in C and D, respectively. Abbreviations: 2,3-DMPQ,
2,3-dimethyl-5-phytylplastoquinol.

6D. Incubation with Lysozyme treated Synechocystis 
Synechocystis 6803 wild type and slr1737 knockout mutant cells from the late
logarithmic growth phase (approximately 1 g wet cells per experiment in a total
volume of 3 ml) were treated with lysozyme and subsequently incubated with
S-adenosylmethionine and phytylpyrophosphate, plus radiolabelled homogentisic
acid. After 17 h incubation in the dark at room temperature the samples were extracted
with 6 ml chloroform/methanol (1/2 v/v). Phase separation was obtained by the
addition of 6 ml 0.9% NaCl solution. This procedure was repeated three times. Under
these conditions 2,3-dimethyl-5-phytylplastoquinol is oxidized to form
2,3-dimethyl-5-phytylplastoquinone.

The extracts were analyzed by normal phase and reverse phase HPLC. Using
extracts from wild type Synechocystis cells, radiolabelled gamma-tocopherol and
traces of radiolabelled 2,3-dimethyl-5-phytylplastoquinone were detected. When
extracts from the slr1737 knockout mutant were analyzed, only radiolabelled
2,3-dimethyl-5-phytylplastoquinone was detectable. The amount of
2,3-dimethyl-5-phytylplastoquinone was significantly increased compared to wild type
extracts. Heat treated samples of the wild type and the slr1737 knockout mutant did not
produce radiolabelled 2,3-dimethyl-5-phytylplastoquinone, nor radiolabelled tocopherols.
These results further support the role of the slr1737 expression product in the
cyclization of 2,3-dimethyl-5-phytylplastoquinol.

6E. Arabidopsis Homologue to slr1737

An Arabidopsis homologue to slr1737 was identified from a BLASTALL
search using Synechocystis sp. 6803 gene slr1737 as the query, in both public and
proprietary databases. SEQ ID NO:109 and SEQ ID NO:110 are the DNA and
translated amino acid sequences, respectively, of the Arabidopsis homologue to
slr1737. The start is found at the ATG at base 56 in SEQ ID NO:109.

The sequence obtained for the homologue from the proprietary database
differs from the public database sequence (F4D11.30, BAC AL022537) in having a start
site 471 base pairs upstream of the start identified in the public sequence. A comparison
of the public and proprietary sequences is provided in Figure 31. The correct start
corresponds to position 12080 within the public database sequence, while the public
sequence start is given as being at 11609.

Attempts to amplify a slr1737 homologue were unsuccessful using primers
designed from the public database, while amplification of the gene was accomplished
with primers obtained from SEQ ID NO:109.

Analysis of the protein sequence to identify transit peptide sequence predicted 
two potential cleavage sites, one between amino acids 48 and 49, and the other 
between amino acids 98 and 99. 

6F. slr1737 Protein Information

The slr1737 orf comprises 363 amino acid residues and has a predicted MW of
41 kDa (SEQ ID NO:39). Hydropathic analysis indicates the protein is hydrophilic
(Figure 32).

The Arabidopsis homologue to slr1737 (SEQ ID xx) comprises 488 amino
acid residues, has a predicted MW of 55 kDa, and has a putative transit peptide
sequence comprising the first 98 amino acids. The predicted MW of the mature form
of the Arabidopsis homologue is 44 kDa. The hydropathic plot for the Arabidopsis
homologue also reveals that it is hydrophilic (Figure 33). Further BLAST analysis of
the Arabidopsis homologue reveals limited sequence identity (25% sequence
identity) with the beta-subunit of respiratory nitrate reductase. The sequence
identity to nitrate reductase suggests that the slr1737 orf encodes an enzyme that likely
involves a general acid catalysis mechanism.
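
The predicted molecular weights quoted above can be sanity-checked against residue counts. The sketch below is not the method used to obtain the predicted values, which would normally be computed from the actual amino acid composition; it applies the common rule of thumb of roughly 110 Da per residue, which is an assumption, and the names used are hypothetical.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average; an assumption


def approx_mw_kda(residue_count):
    """Rough molecular weight estimate in kDa from the number of residues."""
    return residue_count * AVERAGE_RESIDUE_MASS_DA / 1000.0


TRANSIT_PEPTIDE_RESIDUES = 98  # putative transit peptide length from the text

# Residue counts taken from the text; the mature form lacks the transit peptide
estimates = {
    "slr1737": approx_mw_kda(363),
    "Arabidopsis homologue (precursor)": approx_mw_kda(488),
    "Arabidopsis homologue (mature)": approx_mw_kda(488 - TRANSIT_PEPTIDE_RESIDUES),
}

# Predicted values quoted in the text, in kDa
reported = {
    "slr1737": 41.0,
    "Arabidopsis homologue (precursor)": 55.0,
    "Arabidopsis homologue (mature)": 44.0,
}
```

The rough estimates (about 40, 54, and 43 kDa) each fall within 2 kDa of the quoted values of 41, 55, and 44 kDa, so the residue counts, transit peptide length, and predicted weights are mutually consistent.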

Investigation of known enzymes involved in tocopherol metabolism indicated
that the best candidate corresponding to the general acid mechanism is the tocopherol
cyclase. There are many known examples of cyclases, including tocopherol cyclase,
chalcone isomerase, lycopene cyclase, and aristolochene synthase. Upon further
examination of the microscopic catalytic mechanism of phytylplastoquinol cyclization,
chalcone isomerase, as an example, has a catalytic mechanism most similar to
tocopherol cyclase (Figure 34).

Multiple sequence alignment was performed between slr1737, the slr1737
Arabidopsis homologue and the Arabidopsis chalcone isomerase (Genbank: P41088)
(Figure 35). 65% of the conserved residues among the three enzymes are strictly
conserved within the known chalcone isomerases. The crystal structure of alfalfa
chalcone isomerase has been solved (Jez, Joseph M., Bowman, Marianne E., Dixon,
Richard A., and Noel, Joseph P. (2000) "Structure and mechanism of the
evolutionarily unique plant enzyme chalcone isomerase". Nature Structural Biology
7: 786-791). It has been demonstrated that tyrosine (Y) 106 of the alfalfa chalcone
isomerase serves as the general acid during the cyclization reaction (Genbank: P28012).
The equivalent residue in slr1737 and the slr1737 Arabidopsis homologue is lysine (K),
which is an excellent catalytic residue as a general acid.

The information available from partial purification of tocopherol cyclase from
Chlorella protothecoides (U.S. Patent No. 5,432,069), i.e., described as being glycine
rich, water soluble and with a predicted MW of 48-50 kDa, is consistent with the
protein informatics information obtained for slr1737 and the Arabidopsis slr1737
homologue.

All publications and patent applications mentioned in this specification are
indicative of the level of skill of those skilled in the art to which this invention
pertains. All publications and patent applications are herein incorporated by reference
to the same extent as if each individual publication or patent application was
specifically and individually indicated to be incorporated by reference.

Although the foregoing invention has been described in some detail by way of
illustration and example for purposes of clarity of understanding, it will be obvious
that certain changes and modifications may be practiced within the scope of the
appended claims.

CLAIMS

What is claimed is: 

1. An isolated nucleic acid sequence encoding a tocopherol cyclase.
2. An isolated nucleic acid sequence according to Claim 1, wherein said tocopherol
cyclase is active in the cyclization of 2,3-dimethyl-5-phytylplastoquinol to tocopherol.
3. An isolated nucleic acid sequence according to Claim 1, wherein said tocopherol
cyclase is active in the cyclization of 2,3-dimethyl-5-geranylgeranylplastoquinol to
tocotrienol.

4. An isolated DNA sequence according to Claim 1, wherein said nucleic acid sequence is
isolated from a eukaryotic cell source.

5. An isolated DNA sequence according to Claim 4, wherein said eukaryotic cell source is 
selected from the group consisting of mammalian, nematode, fungal, and plant cells. 

6. The DNA encoding sequence of Claim 5 wherein said tocopherol cyclase protein is from
Arabidopsis.

7. The DNA encoding sequence of Claim 6 wherein said tocopherol cyclase protein is 
encoded by a sequence of SEQ ID NO: 109. 

8. The DNA encoding sequence of Claim 7 wherein said tocopherol cyclase protein has an
amino acid sequence of SEQ ID NO:110.

9. The DNA encoding sequence of Claim 4 wherein said tocopherol cyclase protein is from a
source selected from the group consisting of Arabidopsis, soybean, corn, rice, wheat, leek,
canola, cotton, and tomato.

10. An isolated DNA sequence according to Claim 4, wherein said prokaryotic source is a
Synechocystis sp.

11. The DNA encoding sequence of Claim 10 wherein said tocopherol cyclase protein is
encoded by a sequence of SEQ ID NO:38.

12. The DNA encoding sequence of Claim 10 wherein said tocopherol cyclase protein has an
amino acid sequence of SEQ ID NO:39.

13. A nucleic acid construct comprising as operably linked components, a transcriptional
initiation region functional in a host cell, a nucleic acid sequence encoding a tocopherol
cyclase, and a transcriptional termination region.

14. A nucleic acid construct according to Claim 13, wherein said nucleic acid sequence
encoding tocopherol cyclase is obtained from an organism selected from the group consisting
of a eukaryotic organism and a prokaryotic organism.

15. A nucleic acid construct according to Claim 14, wherein said nucleic acid sequence 
encoding tocopherol cyclase is obtained from a plant source. 

16. A nucleic acid construct according to Claim 15, wherein said nucleic acid sequence
encoding tocopherol cyclase is obtained from a source selected from the group consisting of
Arabidopsis, soybean, corn, rice, wheat, leek, canola, cotton, and tomato.

17. A nucleic acid construct according to Claim 13, wherein said nucleic acid sequence 
encoding tocopherol cyclase is obtained from a Synechocystis sp. 

18. A plant cell comprising the construct of Claim 13.
19. A plant comprising a cell of Claim 18.

20. A feed composition produced from a plant according to Claim 19.
21. A seed comprising a cell of Claim 18.
22. Oil obtained from a seed of Claim 21.

23. A natural tocopherol rich refined and deodorised oil which has been produced by 
a method of treating an oil according to Claim 22 by distilling under low pressure and 
high temperature, wherein said refined oil has reduced free fatty acids and a 
substantial percentage of tocopherol present in the pretreated oil. 

24. A refined oil according to claim 23, wherein the pretreated oil is crude or
pre-treated soybean oil.

25. A refined oil according to claim 23, wherein the refined oil is degummed and
bleached.

26. A method for the alteration of the isoprenoid content in a host cell, said method
comprising: transforming said host cell with a construct comprising as operably linked
components, a transcriptional initiation region functional in a host cell, a nucleic acid
sequence encoding tocopherol cyclase, and a transcriptional termination region,
wherein said isoprenoid compound is selected from the group of tocopherols and tocotrienols.

27. The method according to Claim 26, wherein said host cell is selected from the group
consisting of a prokaryotic cell and a eukaryotic cell.

28. The method according to Claim 27, wherein said prokaryotic cell is a Synechocystis sp. 

29. The method according to Claim 27, wherein said eukaryotic cell is a plant cell. 

30. The method according to Claim 29, wherein said plant cell is obtained from a plant
selected from the group consisting of Arabidopsis, soybean, corn, rice, wheat, leek, canola,
cotton, and tomato.

31. A method for producing an isoprenoid compound of interest in a host cell, said method
comprising obtaining a transformed host cell, said host cell having and expressing in its
genome:

a construct having a DNA sequence encoding a tocopherol cyclase operably linked to a
transcriptional initiation region functional in a host cell,

wherein said isoprenoid compound is selected from the group of tocopherols and tocotrienols.

32. The method according to Claim 31, wherein said host cell is selected from the group 
consisting of a prokaryotic cell and a eukaryotic cell. 

33. The method according to Claim 32, wherein said prokaryotic cell is a Synechocystis sp.
34. The method according to Claim 32, wherein said eukaryotic cell is a plant cell.

35. The method according to Claim 34, wherein said plant cell is obtained from a plant
selected from the group consisting of Arabidopsis, soybean, corn, rice, wheat, leek, canola,
cotton, and tomato.

36. A method for increasing the biosynthetic flux in a host cell toward production of
an isoprenoid compound, said method comprising: transforming said host cell with a
construct comprising as operably linked components, a transcriptional initiation
region functional in a host cell, a DNA encoding a tocopherol cyclase, and a
transcriptional termination region, wherein said isoprenoid compound is selected from
the group of tocopherols and tocotrienols.

37. The method according to Claim 36, wherein said host cell is selected from the group
consisting of a prokaryotic cell and a eukaryotic cell.

38. The method according to Claim 37, wherein said prokaryotic cell is a Synechocystis sp. 

39. The method according to Claim 37, wherein said eukaryotic cell is a plant cell. 

40. The method according to Claim 39, wherein said plant cell is obtained from a plant
selected from the group consisting of Arabidopsis, soybean, corn, rice, wheat, leek, canola,
cotton, and tomato.

41. The method according to Claim 39, wherein said transcriptional initiation region is a
seed-specific promoter.

Figure 2

Figure 3

Figure 5



WO 01/79472 



PCT/US01/12334 



Figure 6 [restriction map; graphic not reproduced. Surviving site label: EcoRI 5107]



Figure 7 [restriction map; graphic and site labels not reproduced]



Figure 8 [restriction map; graphic and site labels not reproduced]



Figure 9 [restriction map; graphic and site labels not reproduced]



Figure 10 [restriction map; graphic and site labels not reproduced]



Figure 11 [restriction map; graphic and site labels not reproduced]



WO 01/79472 



PCT/US01/12334 



12/40 



Pvull 121 
EcoRI 296 
SacI 306
NcoI 346
Stul 486 
Ncol 736 

EcoRV 1444 
BamHI 1659




EcoRV 17072 
Bglll 16876 



SacI 15323
EcoRV 15225 
Clal 15211 
Pvul 15211 

PvuI 14639

Xhol 13996 

Pvul 13565 
DraI 13410
SacI 13291
PvuII 13281
BglII 13009
Xhol 12398 
Pvul 11890 
Pvull 11755 
SacI 11643
BglII 11620
SmaI 11605
EcoRI 11275 
EcoRI 11127 
Pvul 10987 
Pvull 10738 
Ncol 10644 
EcoRV 10588 

Sail 9951 
Pvul 9916 
Xhol 9886 
Bglll 9830 
EcoRV 9699 



2450 
2519 
2977 
2983 



Pvull 4230 
Pvul 4261 
Xhol 4310 
Pvull 4546 
EcoRV 4841 
SalI 5051
EcoRI 5063
Pvull 5318 
Ncol 5644 
EcoRI 6053 
Smal 6060 



EcoRV 
EcoRV 8560 
EcoRV 8850 
Pvul 9500 



Figure 12 






Pvull 121 
EcoRI 296
SacI 306
SacI 645

EcoRV 1549 



EcoRV 17602 
BglII 17406

SacI 15853
EcoRV 15755 
Clal 15741 
Pvul 15741 

Pvul 15169 

Xhol 14526 

Pvul 14095 
Dral 13940 
SacI 13821
PvuII 13811
BglII 13539
Xhol 12928 
Pvul 12420 
Pvull 12285 
SacI 12173
Bglll 12150 
Smal 12135 
EcoRI 11805 
EcoRI 11657 
Pvul 11517 
Pvull 11268 
Ncol 11174 
EcoRV 11118

Sail 10481 
Pvul 10446 
Xhol 10416 
Bglll 10360 
EcoRV 10229 
Pvul 10030 



Xbal 1578 
EcoRI 1773 
Sail 2075 
Pstl 2085 
Pvull 2539 
Pvull 2608 
Smal 2994 
Xbal 3135 




Bglll 3210 
Bglll 3369 
Notl 3388 
BamHI 3395 
Xhol 3400 
BamHI 3717 

EcoRV 4561 
PvuII 4760
PvuI 4791
Xhol 4840 
Pvull 5076 
EcoRV 5371 
SalI 5581
EcoRI 5593 
Pvull 5848 
Ncol 6174 
EcoRI 6583 
Smal 6590 



Xhol 7732 
Pvul 7742 
Dral 8194 
Pvul 8278 
Xhol 8723 
Smal 8887 
EcoRV 8918 
EcoRV 9090 
EcoRV 9380 



Figure 13 






Pvull 121 
EcoRI 296 
SacI 306
SacI 645

EcoRV 1549 



EcoRV 17602 
BglII 17406

SacI 15853
EcoRV 15755 
Clal 15741 
Pvul 15741 

Pvul 15169 

Xhol 14526 

Pvul 14095 
Dral 13940 
SacI 13821
PvuII 13811
BglII 13539
XhoI 12928
PvuI 12420
PvuII 12285
SacI 12173
BglII 12150
Smal 12135 
EcoRI 11805 
EcoRI 11657 
Pvul 11517 
Pvull 11268 
Ncol 11174 
EcoRV 11118 

Sail 10481 
Pvul 10446 
Xhol 10416 
BglII 10360
EcoRV 10229 
Pvul 10030 



Xbal 1578 
EcoRI 1773 
BamHI 2080 
Notl 2087 
BglII 2106
BglII 2265
Xbal 2340 
Smal 2485 
Pvull 2871 
Pvull 2940 




Pstl 3398 
BamHI 3717 

EcoRV 4561 
Pvull 4760 
Pvul 4791 
Xhol 4840 
Pvull 5076 
EcoRV 5371 
Sail 5581 
EcoRI 5593 
Pvull 5848 
Ncol 6174 
EcoRI 6583 
Smal 6590 



Xhol 7732 
Pvul 7742 
DraI 8194
Pvul 8278 
Xhol 8723 
Smal 8887 
EcoRV 8918 
EcoRV 9090 
EcoRV 9380 



Figure 14 






Pvull 121 
EcoRI 296 
SacI 306
NcoI 346
StuI 486
Ncol 736 

EcoRV 1444 



EcoRV 17086 
BglII 16890

Sacl 15337 
EcoRV 15239 
Clal 15225 
Pvul 15225 

Pvul 14653 

Xhol 14010 

Pvul 13579 
DraI 13424
Sacl 13305 
Pvull 13295 
BglII 13023
Xhol 12412 
Pvul 11904 
Pvull 11769 
Sacl 11657 
BglII 11634
Smal 11619 
EcoRI 11289 
EcoRI 11141 
Pvul 11001 
Pvull 10752 
Ncol 10658 
EcoRV 10602 

Sail 9965 
Pvul 9930 
Xhol 9900 
BglII 9844
EcoRV 9713 



Pstl 1670 
PvuII 2124
PvuII 2193
SmaI 2579
XbaI 2720
BglII 2795
BglII 2954
Notl 2973 
BamHI 2980 
Sacl 2997 




Pvull 4244 
Pvul 4275 
Xhol 4324 
Pvull 4560 

EcoRV 4855 
Sail 5065 
EcoRI 5077 
Pvull 5332 
Ncol 5658 
EcoRI 6067 
Smal 6074 



Xhol 7216 
Pvul 7226 
DraI 7678
PvuI 7762
Xhol 8207 
Smal 8371 
EcoRV 8402 
EcoRV 8574 
EcoRV 8864 



Figure 15 






EcoRI 5107
BamHI 5076




Figure 16 






PvuII 121
EcoRI 296
SacI 306
SacI 645



EcoRV 17272 
BglII 17076



Sacl 15523 
EcoRV 15425 
ClaI 15411
Pvul 15411 

Pvul 14839 

Xhol 14196 

Pvul 13765 
Dral 13610 
SacI 13491
PvuII 13481
BglII 13209
Xhol 12598 
Pvul 12090 
Pvull 11955 
Sacl 11843 
Bglll 11820 
Smal 11805 
EcoRI 11475 
EcoRI 11327 
Pvul 11187 
Pvull 10938 
Ncol 10844 
EcoRV 10788 

Sail 10151 
Pvul 10116 
Xhol 10086 
Bglll 10030 
EcoRV 9899 



EcoRV 1549 
Xbal 1578 
EcoRI 1773 
BamHI 2080 
Notl 2087 
Hindlll 2118 
Smal 2436 
Dral 2461 
Ncol 2524 
Pvull 2617 
Ncol 2780 
NcoI 2812
NcoI 2935
PvuII 2961
Pstl 3068 




BamHI 3387 
EcoRV 4231 
Pvull 4430 
Pvul 4461 
Xhol 4510 
Pvull 4746 

EcoRV 5041 
Sail 5251 
EcoRI 5263 
Pvull 5518 
Ncol 5844 
EcoRI 6253 
Smal 6260 



Xhol 7402 
Pvul 7412 
Dral 7864 
Pvul 7948 
Xhol 8393 
Smal 8557 
EcoRV 8588 
EcoRV 8760 
EcoRV 9050 



Pvul 9700 



Figure 17 






Pvull 121 
EcoRI 296 
SacI 306
NcoI 346
StuI 486
Ncol 736 

EcoRV 1444 




EcoRV 16940 
Bglll 16744 

SacI 15191
EcoRV 15093 
Clal 15079 
Pvul 15079 

Pvul 14507 

XhoI 13864

Pvul 13433 
Dral 13278 
Sacl 13159 
Pvull 13149 
Bglll 12877 
Xhol 12266 
Pvul 11758 
Pvull 11623 
Sacl 11511 
Bglll 11468 
Smal 11473 
EcoRI 11143 
EcoRI 10995 
Pvul 10855 
Pvull 10606 
Ncol 10512 
EcoRV 10456 

Sail 9819 
Pvul 9784 
Xhol 9754 
Bglll 9698 



BamHI 1659 
Notl 1666 
Pvul 1735 
EcoRI 1759 
Smal 1825 
EcoRV 2158 
Ncol 2426 
EcoRV 2654 
Dral 2731 
Pstl 2845 
Sacl 2851 



Pvull 4098 
Pvul 4129 
Xhol 4178 
Pvull 4414 

35S promoter

EcoRI 4931
Pvull 5186 
Ncol 5512 
EcoRI 5921 
Smal 5928 

Xhol 7070 
Pvul 7080 
Dral 7532 
Pvul 7616 
Xhol 8061 
Smal 8225 
EcoRV 8256 
EcoRV 8428 
EcoRV 8718 
Pvul 9368 
EcoRV 9567 



Figure 18 






Pvull 121 
EcoRI 296
SacI 306
SacI 645

EcoRV 1549 



EcoRV 17470 

BglII 17274



SacI 15721
EcoRV 15623
ClaI 15609
PvuI 15609

Pvul 15037 
Xhol 14394 




Pvul 13963 
Dral 13808 
SacI 13689
Pvull 13679 
Bglll 13407 
Xhol 12796 
Pvul 12288 
Pvull 12153 
Sacl 12041 
Bglll 12018 
Smal 12003 
EcoRI 11673 
EcoRI 11525 
Pvul 11385 
Pvull 11136 
Ncol 11042 
EcoRV 10986 

Sail 10349 
Pvul 10314 
Xhol 10284 
Bglll 10228 
EcoRV 10097 
Pvul 9898 



Xbal 1578 
EcoRI 1773 
BamHI 2080
NotI 2087
Pvul 2156 
EcoRI 2180 
Smal 2246 
EcoRV 2579 
Ncol 2847 
EcoRV 3075 
Dral 3152 
Pstl 3266 
BamHI 3585 

EcoRV 4429 
Pvull 4628 
Pvul 4659 
Xhol 4708 
Pvull 4944 
EcoRV 5239 
Sail 5449 
EcoRI 5461 
Pvull 5716 
Ncol 6042 
EcoRI 6451 
Smal 6458 



Xhol 7600 
Pvul 7610 
Dral 8062 
Pvul 8146 
Xhol 8591 
Smal 8755 
EcoRV 8786 
EcoRV 8958 
EcoRV 9248 



Figure 19 






Pvull 121 
EcoRI 296
Sacl 306 
Ncol 346 
Stul 486 
Ncol 736 

EcoRV 1444 



EcoRV 16742 
Bglll 16646 



SacI 14993
EcoRV 14896 
Clal 14881 
Pvul 14881 

Pvul 14309 

Xhol 13666 

Pvul 13235 
Dral 13080 
Sacl 12961 
Pvull 12951 
Bglll 12679 
Xhol 12068 
Pvul 11560 
Pvull 11425 
Sacl 11313 
Bglll 11290 
Smal 11275 
EcoRI 10945 
EcoRI 10797 
Pvul 10657 
Pvull 10408 
Ncol 10314 
EcoRV 10258 

Sail 9621 
Pvul 9586 
Xhol 9556 



BamHI 1659 
Notl 1666 
Hindlll 1697 
Smal 2015 
Dral 2040 
Ncol 2103 
Pvull 2196 
Ncol 2359 
Ncol 2391 
Ncol 2514 
Pvull 2540 
Pstl 2647 
Sacl 2653 




Pvull 3900 
Pvul 3931 
Xhol 3980 
Pvull 4216 
EcoRV 4511 
Sail 4721 
EcoRI 4733 
Pvull 4988 

Ncol 5314 



EcoRI 5723 
Smal 5730 



XhoI 6872
Pvul 6882 
Dral 7334 
Pvul 7418 
Xhol 7863 
Smal 8027 
EcoRV 8058 
EcoRV 8230 
EcoRV 8520 
Pvul 9170 
EcoRV 9369
Bglll 9500 



Figure 20 





Plant line number 



Figure 24 






Plant line number 



Figure 25

Total tocopherol in Napin ATPT2 Canola Seed

Plant Line: 9290 control, 10822-1

Series: 35 DAP, 45 DAP



Figure 28 






A

Scale bar: 1 kb

B

Peak label (both traces): 2-Methyl-6-Phytylplastoquinone

x-axis: Retention time [min]



Figure 29 






Figure 30 






Query Sequence: F4D11 AL022537 

Database : PIR_T04448 . atcea , list . f asta 

Database: PIR_T04448 
Plus (+) denotes forward strand, and minus (-) reverse strand. 
Asterisks (*) denote bases not shown on pairwise alignments.

Alignment 1 



Query- 
genomic 

ATCEA4C371+ 



Met 
Query- 

ATCEA4C371+



12194 CACACGTTCTCGTCCTTTTCTTCTTCCTCTCTGCATTCTTCACAGAGTTTGTCACCACCA 



C. est 



: first 




12134 MBi|fl|gtffl{Ml|JBBBiHIIII^ 

It IMMII-tlMtllllilllllllM IMIilitMlllllNllllMMIttll 

2 ACCCCAAACATCACAArTTCACATTCTTTTGCATATTTCTTCTTCTTCTTCCATTATGGA 



Query- 
ATCEA4C371+ 



12075 GATACGGAGCTTGATTGTTTCTATGAACCCTAATTTATCTTCCTTTGAGCTCTCTCGCCC 

! Ill M IIIIIII IliMMIHIII IIIDIM ||])] MMMIIIH HIMI1IM 

62 GATACGGAGCTTGATT6TTTCTATGAACCCTAATTTATCTTCCTTTGAGCTCTCTCGCCC 



Query- 
ATCEA4C371+ 



12015 TGTATCTCCTCTCACTCGCTCACTAGTTCCGTTCCGATCGACTAAACTAGTTCCCCGCTC 

iMiiiMiiiiiiiMimi minium imitmmiii minimi 

122 TGTATCTCCTCTCACTCGCTCACTAGTTCCGTTCCGATCGACTAAACTAGTTCCCCGCTC 



Query- 
ATCEA4C371+ 



11955 catttctagggtttcgHIMatctccaccccgaatagtgaaactgacaagatctccgt 
1 illMI MM 111 Mill III1MI 111! MMUilll III! II ill! Mllllllll 

182 CATrTCTAGGGTTTCGGCGTCGATCTCCACCCCGAATAGTGAAACTGACAAGATCTCCGT 



Query- 11895 
ATCEA4C371+ 242 
here 



t I 1 1 I 1 1 I 1 1 1 M 1 I ) 1 1 M I 1 [ 1 i I I 1 I 1 1 I 1 1 I 1 1 1 ) I I 1 1 I I )UJJ.|J )|Tli 



Synecho seq aligns from 



Query- 11835 AATTGATCCATTCCATTCCATTTCTCTTCTCTTGTTTGTTTTATTAAGCTCCAATTTCAG 

ATCEA4C371+ 299 



Query- 
ATCEA4C371+ 



PIR:T04448 



-v- 60 bp removed 



11715 ******************************************************+* 



TTTG 



299 



Query- 

ATCEA4C371+ 

PIR:T04448 



11655 GTGGCTCACCATTCGACGACTACTTTTGAATTTGAGTTTTTGAAAAATGCAATTTAACAT 



299 



. * • • • * i ■ 



M Q F N I 
arab sequence which is incorrect 



Figure 31A
Query-

ATCEA4C371+

PIR:T04448

Query-

ATCEA4C371+

PIR:T04448

Query-

ATCEA4C371+

PIR:T04448

Query-

ATCEA4C371+

PIR:T04448

Query-

ATCEA4C371+

PIR:T04448
ATCEA4C371+

Query-

ATCEA4C371+

PIR:T04448
PIR:T04448



: • : - : . : 
11595 CAGAGAGTTTTTTTTTTTATGGTTGATAACTTATTGTTTAACTTTTGAAAAATGCA^VTk 



299 



6REFFFLWLITYC 



■JJL1. 

» • « • 

<■ • ■ * 

LTFEKCRY 



11535 CCATTTCGATGGAACACCTCGGAAGTTCTTCGAGGGATGGTATTTCAGGGTTTCCATCCC 
Ml 1 M I !! 1 j H! I Ml I M | ! 1 1 M I 1 1 1 M 1 1 1 | 1 1 1 m 1 1 HMIlf f 1 1 1 1 1 1 1 
302 CCATTTCGATGGAACACCTCGGAAGTTCTTCGAGGGATGGTATTTcJM 



* * a * • 



26 HFDGTPRKFFEG w"*Y *F 



S I 



• • : • i . : . : • 
11475 AGAGaAGAGGGAGAGTTTTTGTTTrATGTATTCTGTGGAGAATCCTGCATTTCGGCAGAG 

^ 0 1 1 1 1 1 1 1 1 1 1 ' 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 i 1 1 1 1 1 ) t h 1 1 

3 62 AGAGAAGAGGGAGAGTTTTTGTTTTATGTATTCTGTGGAGAATCCTG^ 



* * * • 



46 E K R E 



t * * m 



* • • # 



* a » • • 



■ a * * a 



FCFMYSVENPAFRQS 



11415 TTTGTCACCATTGGAAGTGGCTCTATATG^CCTAGATT^CTGGTGTTGGAGCTCAGAT 

1 1 1 m 1 1 1 1 1 1 1 1 1 1 n 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 m 1 1 1 n 1 1 1 1 1 1 1 1 1 1 mil mm 

422 TTTGTCACCATTGGAAGTGGCTCTATATGGACCTAGATTCACTGGTGTTGGAGCTCAGAT 
s::!:::::::::::::::::::::::::::::::::::;::: . .... . . . .... :: . 

66 h S Pl*EVAIiYGPRFTGVGAQI 



11355 TCTTGGCGCTAATGATAAATATTTATGCCAATACGAACAAGACTCTCACAATTTCTGGGG 

„ M 11 '111 Ml II II II I Mil II Ml | || Ml 1 1| || i| || i| M , |! |, ,,,,,,.„„ 

482 TCTTGGCGCTAATGATAAATATTTATGCCAATACGAACAAGACTCTCACAATTTC 

:::::::::::::::::::::::::: : : : : : : : : : : : : : : : : : - : : . : . : : : : : : ; : : . . 

86 L G ANDKYLCQYEQDSHNFWG 

11301 Confidence: 100 100 



Exon 



11538 



i . : . : 
11295 AGGT AACTCCTTGAC CCTTAAAATGCTGTGTC ATGACAATAAGAAATC ATATCT GAGTCl 

537 



106 D 
Exon 



11609 11294 Confidence: 100 100 



Query- 11235 TTTCTCTACTTCTAGTACTAATGTTCGTTATTGTTGTTA^GATCTAAGTCTTATCTGAA 

PIR:T04448 107 

Query- 11175 TTTTGTTACATTTTGGTTCTGGTGCTTTCTCAACATGAATTTGTATATATGACTTTAAAG 



PIR:T04448 

Query- 
PIR:T04448 



PIR.T04448 
PIR:T04448 



107 



11115 ATTGCTT AC CTAAAGTTT TTACTCATGCAT AGATCGAC^ 



107 



RHELVLGNT 



Query- 11055 TTTTAGTGCTGTGCCAGGCGCAAAGGCTCCAAACMGGAOTTTC^CCA^GGTTCTCAC 



..... -•*•••••••••»••■•«» 



116FSAVPGAKAPNKEVPPE 
Exon 11083 11004 Confidence: 96 100 



Figure 31 B
Query- 10995 ^ 3 j^^TGTTATCTGTT AAAT AGTTTTT CC AATTGT ATC CGGAT AGTP

PIR:T04448 133 

Query- 10935 GTTCTACTT^ 

PIR:T04448 133 

Query- 10875 TTGMTTGT^gcW 

PIR:T04448 133 ' ! : ' ! :: " s :::::::::::::::::: : 

EFNRRVSEGF 

Query- 10815 ^^^^^ctc^ttttgg^tcaaggtcacatttgcgatgatggccggtaattatatga 

••••••••••■••••••iiillJliii' 

PIR:T04448 143 Q A T P F W H Q G H 'j ' c" i>" D ' G ' n 

PIR:T04448 Exon 10844 10768 Confidence: 100 100

Query- 10755 TTCTATGCACM^ 

PIR:T04448 159 

Query- 10695 ^^^^^^^^^^^^TCTGACTTGACTTGTTTTGTCAGTaCTGACTATGCGGARACTG 

PIR.-T04448 159 •::::::::::::::::::: 

T D Y A E T V 

Query- 10635 TGAAATCTGCTCGTTGGGAGTATAGTACTCGTCCCGTTTACGGTTGGGGTGATGTTGGGG 

PIR: T 04448 1 66 ' T " a ' T T 'i ' 'J ' V »' V T 'i ' V T V V V =0 = V T ' k 

• ■ • I • • 

Query- 10575 CCAAAC^GAAGTCAACTTCAC^ 

PXR:T04448 186 ' =K ' ^ = T 1 T T V V ^ 1 V T T V »' »' y» 

Query- X0 5 15 AGATATGCATGGCAGGAGGCCTTTCCACAGGTGTGAGCTTTGCTTGATTGACTTAAA 

PIR:T04448 206 I C M A G G L S t"g 

PIR:T04448 Exon 10655 10486 Confidence: 96 100 

Query- 10455 ^TA^TAGACGGTT^ 

PIR:T04448 216 ■ 

Query- 10395 TCT ATC AGC AGAAACTGCT AT T GT AGT T C T^T JkT TTT T TCT CTTT GT ATT TGC AG G 

PIR:T04448 216 * 

Query- 10335 GTGGATAGAATGGGGCGGTGAAAGGTTTGAGTTTC 

PXR: T 04448 216 v T Y V V T T T y T v v T T V V Y T 

Query- 10275 GAATTGGGGTGGAGGCTTCCCAAGAAAATGGTTTT 



PIR:T04448 236 N W G G G F P R k M f'V 

PIR:T04448 Exon 10336 10239 Confidence: 96 100 



Figure 31 C
Query-
PIR:T04448

Query-
PIR:T04448

Query-
PIR:T04448

Query-

PIR:T04448
PIR:T04448

Query-
PIR:T04448

Query-

PIR:T04448

GSDB:S:495-

Query-

PIR:T04448

GSDB:S:495-

PIR:T04448

GSDB:S:495-

Query-

PIR:T04448

GSDB:S:495-

Query-

PIR:T04448

GSDB:S:495-

Query-

PIR:T04448

GSDB:S:495-

Query-

PIR:T04448

GSDB:S:495-



■ w 

10215 ACATTTCTTGTTGCAGACTTTAGTTAGCTAGTG 
248 

10155 CTTT ATTT GT CAAT GT CT CTT TAC AGGTCC AGT GTAATGTCTTTGA 
248 "•••■•••••■•••••••II 

VQCNVFE 

10095 AGGGGCAACTGGAGAAGTTTCTTTAACCG(bvGGTGGCGGGTTGAGGCAATTGCCTGGATT 

::::::::: ::::::::::::::::::::::-::::::: SSS!! .77 

255 GATGEVALTAGGGLR Q £ p" s'T 

10035 GACTGAGACCTATGAAAATGCTGCACTGGTATGCACTTATAAGATCTTCTTAAGCAATGA 
i : : : i : : ; : ; : : : ; : : : : : ; : : ^ 

275 TETYEKAAL 

Exon 10115 10008 Confidence i 100 100 

9975 CAGTGAGTATTAGAAG^ 

284 : : 

V 

9915 TTGTGTACA^ 

285 v'riTvvv'MTv'pTTT^vT'^RT 

532 inm 

tagatg 

9855 GGAAATGTCTCCCTGGGG ^ATTG^TATATAACTGCAGAGAACGAAAACCATGTGGTAA 

:::::::::::::::::: ::::!:::::::::•:;:;.......... 

305 EMSPWG YWYITAE N E N* *H* "v* * 

526 iiiiU"i'!' Il i Ill, ^I , U l 1111,1,1 > »l 1 I I I I I II IIIIIIIIH 

526 ggaaat tctccctgggggttattggtatataactgcagagaNcgNaaaccatgtg 

Exon 9917 9801 Confidence: 100 100 

Exon 9861 9801 Confidence : 93 93 

9796 AimGTTT^ 
323 " 
471 

9736 GATTCCAACACCCGATGAATGTCTTGTGACAGGTGGAACTAGAGGCAAGA^ 

°_ VELEARTNEA 
471 HIM! [IIIMII illlllMIII || 

gtggaactagaggcNagaacaaatgaag 

•••:.: • . 

9676 CGGGTACACCTCTGCGTGCTCCTACCACAGAAGTTGGGCTAGCTACGGCTTGCAGAGATA 

333 '^^'i'Ta'TT^TT v'vT A !S i !8 i ,, A ,,,,, A SS8 

443 cgggtacacctctgc^gctcctaccacagaagttgggctagctacggcttgcagagata 

• I • I • • 

9616 OTGTTACOTT^ 

••"•'•"•■••••J 11 !!!":::::::::::::::::::::::: 

353 C y G B 1 K L Q I W E R h Y D G 's"k'"g"k 

,„ iU'U 1 '"! 1 "!! IIIMIIIIIIIIIIIIII IIIIIIIIH III 

383 gttgttacggtgaattgaagttgcagatatgggaacggctatatgatggaagtaaaggca 



Figure 31 D
Query- 9556 AGCTATCTATGC^

PIR:T04448 373 ; : : : : : : :::::::::::::::::: : 

,| L K_V L T N P K A 

GSDB:S: 495- 323 ag 

GSDB T S 4 495 ?* E2 4 9555 C °nf" e nce: 100 100 

GSDB. S. 495- Exon 9704 9555 Confidence: 98 100 

Query- 9 496 ^^^^SAAGATTATGflACGTrTGTTATGGTTAaCAATGATGCAGGTGATATTAGAGAC 

PI*:T0444 8 382 V T J^i* ' 'i' 'i' 'i* ^"i* ^' *i ' 'i* "i" 'i' 

GSDB:S:495- 321 flUMMIMIII 

gtgatattagagac 

Query- 943 6 AAAGAGCTCAATGGCAGCAGTGGAGATAG(LiGGAGGACCGTGGTTTGGGACATGGAAAGG 

PIR:T04448 402 Vii^TT^i'i'rG Tp's i's 

Query- 93 76 AGATACGAGCAACACGCCCGAGCTACTAAAACAGGCTCTTCAG^TCCCATTGGATCTTGA 

PIH:T04448 422 V V T v v T T T V Y ^ T V v V V T v T s ; s 

' S • 4SS " 247 a ^ tac ^5caacacgcccgagctacta a aacaggctct tLggticiatUgaiciiga 

: : . : 

(3t0p)^ 9316 ^ GCGCCTTA ^ TTTGGTC ^^ 

I::::::::::::::::::::::;: : ; ; , 

PIR:T04448 442 SALGLVPF F K P i"a'L" 

PIR:T04448 456 

GSDB-S-495- 127 '111 'II 1 1 ' 1 ' » ' I I I I I I I I I I I I I I | | I | | | | | | I I I I I I I I I | | | | | | | 

gsdb . s . 495 127 ttgtttgttgatagagacccatgtgatgaatgaaffccttagtcatgtcattgctagcttc 

Query- 9 1 96 »«ATTATGTATGTATGATTTT A GT^ 

G SDB ,, 95 - 67 l^aigiaigl^^^ 

• ! . ; • ■ 

Query- 913 6 GTAAAGTC^ : 

GSDB:S:495- 7 gtaaagt 

GSDB: S : 495- Exon 9450 9130 Confidence: 98 100 

^ffi^«L^? 3/ ^ ,Cm85M ■ 1 ■ , 4 ' 0e " 43 (AL022537 > P-tem 

PIR:T04448 3PIR-T04448 shypothetical protein F4D11 30 - ArahidoT^i « 

53063693 , en* | CAA18584 . 1 (A1022537 , paLti^p^& 3 [ ta aE£S2 'tLuS J^illll . 30 

farih,-^ 55 - "I* 1 ?? 5392 IM995392 1 701673779 A. thaliana, Columbia Col-0, inflorescence- 
1 Arabidopsis thaliana cDNA clone 701673779, mRNA sequence. inflorescence- 



Figure 31 E

Position 



Figure 32

Position



Figure 33

Phytoplastoquinol
Tocopherol Cyclase
R = H or CH3
Tocopherol

Chalcone
Chalcone Isomerase

Lycopene
Lycopene Cyclase
Carotene

Farnesyl diphosphate
Aristolochene Synthase
Germacrene



Figure 34

slr1737_SYNSP_S74814
slr1737_ARATH_T04448  MEIRSLIVSMNPNLSSFELSRPVSPLTRSLVPFRSTKLVPRSISRVSASI
CFI_ARATH_P41088

slr1737_SYNSP_S74814  KFP PHSGYHWQGQS-PFFEGWYVRLL
slr1737_ARATH_T04448  STPNSETDKISVKPVYVPTSPNRELRTPHSGYHFDGTPRKFFEGWYFRVS
CFI_ARATH_P41088

slr1737_SYNSP_S74814  LPQSGESFAFMYS IENPAS DHHYGGGAVQILGPATK KQENQEDQLV
slr1737_ARATH_T04448  IPEKRESFCFMYSVENPAFRQSLSPLEVALYGPRFTGVGAQILGANDKYL
CFI_ARATH_P41088      MS S SNACAS PSPFPA VTKLHVDS V-

slr1737_SYNSP_S74814  WRTFPSVKKFWASPRQFALG-HWGKCRDNRQ-AKPLLSEEFFATVKEGYQ
slr1737_ARATH_T04448  CQYEQDSHNFWGDRHELVLGNTFSAVPGAKAPNKEVPPEEFNRRVSEGFQ
CFI_ARATH_P41088      -TFVPSVKSPASSNPLFLG-GAGVRGLDIQ-GK FVIFTVIGVY

slr1737_SYNSP_S74814  IHQNQHQGQIIHGDR HCRWQFTVEPEVTWGS PNRFPRATAGW
slr1737_ARATH_T04448  AT P FWHQGHI CDDGRTDYAETVKSARWEYSTRPVYGWGDVGAKQKSTAGW
CFI_ARATH_P41088      LEGNAVPSLSV KWKGKTTEELTESIPFFREIVTGAF

slr1737_SYNSP_S74814  LSFLPLFDPGWQILLAQGRAHGWLKWQREQYEFDHALVYAEKNWGHSFPS
slr1737_ARATH_T04448  PAAFPVFEPHWQICMAGGLSTGWIEWGGERFEFRDAPSYSEECNWGGGFPR
CFI_ARATH_P41088      EKFIKVT M KLPLTGQQYSEKVTENC

slr1737_SYNSP_S74814  RWFWLQANYFPDHPG-LSVTAAGGERIVLGRPE EVALIGLHHQGNFY
slr1737_ARATH_T04448  KWFWVQCNVFEGATGEVALTAGGGLRQLPGLTETYENAALVCVHYDGKMY
CFI_ARATH_P41088      VAI WKQLGL YTDCEA-KAV EKFLEIFKE ET

slr1737_SYNSP_S74814  EFGPGHGTVTWQVAPWGRWQLKASNDRYWVKLSGKTDKKGSLVHTP-TAQ
slr1737_ARATH_T04448  E FVPWNG WRWEMS PWG YWYI TAENENHWELEARTNEAG T PLRAPTTEV
CFI_ARATH_P41088      -FPPG-SSILFALSPTGSLTVAFSKDDS-IPETGIAVIENKLLAEA-VLE

slr1737_SYNSP_S74814  GLQLNCRDTTRGYLYLQLGSVGHG LIVQGETDTAGLEVGG
slr1737_ARATH_T04448  GLATACRDSCYGELKLQIWERLYDGSKGKVILETKSSMAAVEIGGGPWFG
CFI_ARATH_P41088      SIIGKNGVSPGTRLSVAERLSQ LMMKNKDEKE VS DHSL

slr1737_SYNSP_S74814  DWGLTEENLSKKT VPF
slr1737_ARATH_T04448  TWKGDTSNTPELLKQALQVPLDLESALGLVPFFKPPGL
CFI_ARATH_P41088      EEKLAKEN



Figure 35

SEQUENCE LISTING

<110> Subramaniam, Sai
      Slater, Steven
      Karberg, Katherine
      Chen, Ridong
      Valentin, Henry
      Huang Wong, Yun-Hua

<120> Nucleic Acid Sequences Involved in
      Tocopherol Synthesis

<130> MOCO.008.00WO

<150> 09/549,848
<151> 2000-04-15

<150> 09/688,069
<151> 2000-10-15

<160> 94

<170> FastSEQ for Windows Version 4.0

<210> 1
<211> 1182
<212> DNA
<213> Arabidopsis sp

<400> 1

atggagtctc tgctctctag ttcttctctt gtttccgctg ctggtgggtt ttgttggaag 60

aagcagaatc taaagctcca ctctttatca gaaatccgag ttctgcgttg tgattcgagt 120 

aaagttgtcg caaaaccgaa gtttaggaac aatcttgtta ggcctgatgg tcaaggatct 180 

tcattgttgt tgtatccaaa acataagtcg agatttcggg ttaatgccac tgcgggtcag 240 

cctgaggctt tcgactcgaa tagcaaacag aagtctttta gagactcgtt agatgcgttt 300 

tacaggtttt ctaggcctca tacagttatt ggcacagtgc ttagcatttt atctgtatct 360 

ttcttagcag tagagaaggt ttctgatata tctcctttac ttttcactgg catcttggag 420 



gctgttgttg cagctctcat gatgaacatt tacatagttg ggctaaatca gttgtctgat 480

gttgaaatag ataaggttaa caagccctat cttccattgg catcaggaga atattctgtt 540

aacaccggca ttgcaatagt agcttccttc tccatcatga gtttctggct tgggtggatt 600 

gttggttcat ggccattgtt ctgggctctt tttgtgagtt tcatgctcgg tactgcatac 660

tctatcaatt tgccactttt acggtggaaa agatttgcat tggttgcagc aatgtgtatc 720 

ctcgctgtcc gagctattat tgttcaaatc gccttttatc tacatattca gacacatgtg 780

tttggaagac caatcttgtt cactaggcct cttattttcg ccactgcgtt tatgagcttt 840 

ttctctgtcg ttattgcatt gtttaaggat atacctgata tcgaagggga taagatattc 900 

ggaatccgat cattctctgt aactctgggt cagaaacggg tgttttggac atgtgttaca 960

ctacttcaaa tggcttacgc tgttgcaatt ctagttggag ccacatctcc attcatatgg 1020 

agcaaagtca tctcggttgt gggtcatgtt atactcgcaa caactttgtg ggctcgagct 1080 

aagtccgttg atctgagtag caaaaccgaa ataacttcat gttatatgtt catatggaag 1140 

ctcttttatg cagagtactt gctgttacct tttttgaagt ga 1182 

<210> 2 
<211> 393 
<212> PRT 

<213> Arabidopsis sp 
<400> 2 

Met Glu Ser Leu Leu Ser Ser Ser Ser Leu Val Ser Ala Ala Gly Gly
1 5 10 15
Phe Cys Trp Lys Lys Gln Asn Leu Lys Leu His Ser Leu Ser Glu Ile
20 25 30
Arg Val Leu Arg Cys Asp Ser Ser Lys Val Val Ala Lys Pro Lys Phe
35 40 45
Arg Asn Asn Leu Val Arg Pro Asp Gly Gln Gly Ser Ser Leu Leu Leu
50 55 60
Tyr Pro Lys His Lys Ser Arg Phe Arg Val Asn Ala Thr Ala Gly Gln
65 70 75 80
Pro Glu Ala Phe Asp Ser Asn Ser Lys Gln Lys Ser Phe Arg Asp Ser
85 90 95
Leu Asp Ala Phe Tyr Arg Phe Ser Arg Pro His Thr Val Ile Gly Thr
100 105 110
Val Leu Ser Ile Leu Ser Val Ser Phe Leu Ala Val Glu Lys Val Ser
115 120 125
Asp Ile Ser Pro Leu Leu Phe Thr Gly Ile Leu Glu Ala Val Val Ala
130 135 140
Ala Leu Met Met Asn Ile Tyr Ile Val Gly Leu Asn Gln Leu Ser Asp
145 150 155 160
Val Glu Ile Asp Lys Val Asn Lys Pro Tyr Leu Pro Leu Ala Ser Gly
165 170 175
Glu Tyr Ser Val Asn Thr Gly Ile Ala Ile Val Ala Ser Phe Ser Ile
180 185 190
Met Ser Phe Trp Leu Gly Trp Ile Val Gly Ser Trp Pro Leu Phe Trp
195 200 205
Ala Leu Phe Val Ser Phe Met Leu Gly Thr Ala Tyr Ser Ile Asn Leu
210 215 220
Pro Leu Leu Arg Trp Lys Arg Phe Ala Leu Val Ala Ala Met Cys Ile
225 230 235 240
Leu Ala Val Arg Ala Ile Ile Val Gln Ile Ala Phe Tyr Leu His Ile
245 250 255
Gln Thr His Val Phe Gly Arg Pro Ile Leu Phe Thr Arg Pro Leu Ile
260 265 270
Phe Ala Thr Ala Phe Met Ser Phe Phe Ser Val Val Ile Ala Leu Phe
275 280 285
Lys Asp Ile Pro Asp Ile Glu Gly Asp Lys Ile Phe Gly Ile Arg Ser
290 295 300
Phe Ser Val Thr Leu Gly Gln Lys Arg Val Phe Trp Thr Cys Val Thr
305 310 315 320
Leu Leu Gln Met Ala Tyr Ala Val Ala Ile Leu Val Gly Ala Thr Ser
325 330 335
Pro Phe Ile Trp Ser Lys Val Ile Ser Val Val Gly His Val Ile Leu
340 345 350
Ala Thr Thr Leu Trp Ala Arg Ala Lys Ser Val Asp Leu Ser Ser Lys
355 360 365
Thr Glu Ile Thr Ser Cys Tyr Met Phe Ile Trp Lys Leu Phe Tyr Ala
370 375 380
Glu Tyr Leu Leu Leu Pro Phe Leu Lys
385 390

<210> 3
<211> 1224
<212> DNA

<213> Arabidopsis sp 

<400> 3 

atggcgtttt ttgggctctc ccgtgtttca agacggttgt tgaaatcttc cgtctccgta 60
actccatctt cttcctctgc tcttttgcaa tcacaacata aatccttgtc caatcctgtg 120
actacccatt acacaaatcc tttcactaag tgttatcctt catggaatga taattaccaa 180
gtatggagta aaggaagaga attgcatcag gagaagtttt ttggtgttgg ttggaattac 240
agattaattt gtggaatgtc gtcgtcttct tcggttttgg agggaaagcc gaagaaagat 300
gataaggaga agagtgatgg tgttgttgtt aagaaagctt cttggataga tttgtattta 360
ccagaagaag ttagaggtta tgctaagctt gctcgattgg ataaacccat tggaacttgg 420
ttgcttgcgt ggccttgtat gtggtcgatt gcgttggctg ctgatcctgg aagccttcca 480
agttttaaat atatggcttt atttggttgc ggagcattac ttcttagagg tgctggttgt 540
actataaatg atctgcttga tcaggacata gatacaaagg ttgatcgtac aaaactaaga 600
cctatcgcca gtggtctttt gacaccattt caagggattg gatttctcgg gctgcagttg 660
cttttaggct tagggattct tctccaactt aacaattaca gccgtgtttt aggggcttca 720
tctttgttac ttgtcttttc ctacccactt atgaagaggt ttacattttg gcctcaagcc 780
tttttaggtt tgaccataaa ctggggagca ttgttaggat ggactgcagt taaaggaagc 840
atagcaccat ctattgtact ccctctctat ctctccggag tctgctggac ccttgtttat 900
gatactattt atgcacatca ggacaaagaa gatgatgtaa aagttggtgt taagtcaaca 960
gcccttagat tcggtgataa tacaaagctt tggttaactg gatttggcac agcatccata 1020
ggttttcttg cactttctgg attcagtgca gatctcgggt ggcaatatta cgcatcactg 1080
gccgctgcat caggacagtt aggatggcaa atagggacag ctgacttatc atctggtgct 1140
gactgcagta gaaaatttgt gtcgaacaag tggtttggtg ctattatatt tagtggagtt 1200
gtacttggaa gaagttttca ataa 1224

<210> 4
<211> 407
<212> PRT
<213> Arabidopsis sp



<400> 4 

Met Ala Phe Phe Gly Leu Ser Arg Val Ser Arg Arg Leu Leu Lys Ser
1 5 10 15
Ser Val Ser Val Thr Pro Ser Ser Ser Ser Ala Leu Leu Gln Ser Gln
20 25 30
His Lys Ser Leu Ser Asn Pro Val Thr Thr His Tyr Thr Asn Pro Phe
35 40 45
Thr Lys Cys Tyr Pro Ser Trp Asn Asp Asn Tyr Gln Val Trp Ser Lys
50 55 60
Gly Arg Glu Leu His Gln Glu Lys Phe Phe Gly Val Gly Trp Asn Tyr
65 70 75 80
Arg Leu Ile Cys Gly Met Ser Ser Ser Ser Ser Val Leu Glu Gly Lys
85 90 95
Pro Lys Lys Asp Asp Lys Glu Lys Ser Asp Gly Val Val Val Lys Lys
100 105 110
Ala Ser Trp Ile Asp Leu Tyr Leu Pro Glu Glu Val Arg Gly Tyr Ala
115 120 125
Lys Leu Ala Arg Leu Asp Lys Pro Ile Gly Thr Trp Leu Leu Ala Trp
130 135 140
Pro Cys Met Trp Ser Ile Ala Leu Ala Ala Asp Pro Gly Ser Leu Pro
145 150 155 160
Ser Phe Lys Tyr Met Ala Leu Phe Gly Cys Gly Ala Leu Leu Leu Arg
165 170 175
Gly Ala Gly Cys Thr Ile Asn Asp Leu Leu Asp Gln Asp Ile Asp Thr
180 185 190
Lys Val Asp Arg Thr Lys Leu Arg Pro Ile Ala Ser Gly Leu Leu Thr
195 200 205
Pro Phe Gln Gly Ile Gly Phe Leu Gly Leu Gln Leu Leu Leu Gly Leu
210 215 220
Gly Ile Leu Leu Gln Leu Asn Asn Tyr Ser Arg Val Leu Gly Ala Ser
225 230 235 240
Ser Leu Leu Leu Val Phe Ser Tyr Pro Leu Met Lys Arg Phe Thr Phe
245 250 255
Trp Pro Gln Ala Phe Leu Gly Leu Thr Ile Asn Trp Gly Ala Leu Leu
260 265 270
Gly Trp Thr Ala Val Lys Gly Ser Ile Ala Pro Ser Ile Val Leu Pro
275 280 285
Leu Tyr Leu Ser Gly Val Cys Trp Thr Leu Val Tyr Asp Thr Ile Tyr
290 295 300
Ala His Gln Asp Lys Glu Asp Asp Val Lys Val Gly Val Lys Ser Thr
305 310 315 320
Ala Leu Arg Phe Gly Asp Asn Thr Lys Leu Trp Leu Thr Gly Phe Gly
325 330 335
Thr Ala Ser Ile Gly Phe Leu Ala Leu Ser Gly Phe Ser Ala Asp Leu
340 345 350
Gly Trp Gln Tyr Tyr Ala Ser Leu Ala Ala Ala Ser Gly Gln Leu Gly
355 360 365
Trp Gln Ile Gly Thr Ala Asp Leu Ser Ser Gly Ala Asp Cys Ser Arg
370 375 380
Lys Phe Val Ser Asn Lys Trp Phe Gly Ala Ile Ile Phe Ser Gly Val
385 390 395 400
Val Leu Gly Arg Ser Phe Gln
405

<210> 5 
<211> 1296 
<212> DNA 

<213> Arabidopsis sp 



<400> 5 

atgtggcgaa gatctgttgt ttctcgttta tcttcaagaa tctctgtttc ttcttcgtta 60
ccaaacccta gactgattcc ttggtcccgc gaattatgtg ccgttaatag cttctcccag 120
cctccggtct cgacggaatc aactgctaag ttagggatca ctggtgttag atctgatgcc 180
aatcgagttt ttgccactgc tactgccgcc gctacagcta cagctaccac cggtgagatt 240
tcgtctagag ttgcggcttt ggctggatta gggcatcact acgctcgttg ttattgggag 300
ctttctaaag ctaaacttag tatgcttgtg gttgcaactt ctggaactgg gtatattctg 360
ggtacgggaa atgctgcaat tagcttcccg gggctttgtt acacatgtgc aggaaccatg 420
atgattgctg catctgctaa ttccttgaat cagatttttg agataagcaa tgattctaag 480
atgaaaagaa cgatgctaag gccattgcct tcaggacgta ttagtgttcc acacgctgtt 540
gcatgggcta ctattgctgg tgcttctggt gcttgtttgt tggccagcaa gactaatatg 600
ttggctgctg gacttgcatc tgccaatctt gtactttatg cgtttgttta tactccgttg 660
aagcaacttc accctatcaa tacatgggtt ggcgctgttg ttggtgctat cccacccttg 720
cttgggtggg cggcagcgtc tggtcagatt tcatacaatt cgatgattct tccagctgct 780
ctttactttt ggcagatacc tcattttatg gcccttgcac atctctgccg caatgattat 840
gcagctggag gttacaagat gttgtcactc tttgatccgt cagggaagag aatagcagca 900
gtggctctaa ggaactgctt ttacatgatc cctctcggtt tcatcgccta tgactggggg 960
ttaacctcaa gttggttttg cctcgaatca acacttctca cactagcaat cgctgcaaca 1020
gcattttcat tctaccgaga ccggaccatg cataaagcaa ggaaaatgtt ccatgccagt 1080
cttctcttcc ttcctgtttt catgtctggt cttcttctac accgtgtctc taatgataat 1140
cagcaacaac tcgtagaaga agccggatta acaaattctg tatctggtga agtcaaaact 1200
cagaggcgaa agaaacgtgt ggctcaacct ccggtggctt atgcctctgc tgcaccgttt 1260
cctttcctcc cagctccttc cttctactct ccatga 1296

<210> 6
<211> 431
<212> PRT
<213> Arabidopsis sp

<400> 6

Met Trp Arg Arg Ser Val Val Tyr Arg Phe Ser Ser Arg Ile Ser Val
1 5 10 15



WO 01/79472 PCT/US01/12334 

Ser Ser Ser Leu Pro Asn Pro Arg Leu Ile Pro Trp Ser Arg Glu Leu
20 25 30

Cys Ala Val Asn Ser Phe Ser Gln Pro Pro Val Ser Thr Glu Ser Thr
35 40 45

Ala Lys Leu Gly Ile Thr Gly Val Arg Ser Asp Ala Asn Arg Val Phe
50 55 60

Ala Thr Ala Thr Ala Ala Ala Thr Ala Thr Ala Thr Thr Gly Glu Ile
65 70 75 80

Ser Ser Arg Val Ala Ala Leu Ala Gly Leu Gly His His Tyr Ala Arg
85 90 95

Cys Tyr Trp Glu Leu Ser Lys Ala Lys Leu Ser Met Leu Val Val Ala
100 105 110

Thr Ser Gly Thr Gly Tyr Ile Leu Gly Thr Gly Asn Ala Ala Ile Ser
115 120 125

Phe Pro Gly Leu Cys Tyr Thr Cys Ala Gly Thr Met Met Ile Ala Ala
130 135 140

Ser Ala Asn Ser Leu Asn Gln Ile Phe Glu Ile Ser Asn Asp Ser Lys
145 150 155 160

Met Lys Arg Thr Met Leu Arg Pro Leu Pro Ser Gly Arg Ile Ser Val
165 170 175

Pro His Ala Val Ala Trp Ala Thr Ile Ala Gly Ala Ser Gly Ala Cys
180 185 190

Leu Leu Ala Ser Lys Thr Asn Met Leu Ala Ala Gly Leu Ala Ser Ala
195 200 205

Asn Leu Val Leu Tyr Ala Phe Val Tyr Thr Pro Leu Lys Gln Leu His
210 215 220

Pro Ile Asn Thr Trp Val Gly Ala Val Val Gly Ala Ile Pro Pro Leu
225 230 235 240

Leu Gly Trp Ala Ala Ala Ser Gly Gln Ile Ser Tyr Asn Ser Met Ile
245 250 255

Leu Pro Ala Ala Leu Tyr Phe Trp Gln Ile Pro His Phe Met Ala Leu
260 265 270

Ala His Leu Cys Arg Asn Asp Tyr Ala Ala Gly Gly Tyr Lys Met Leu
275 280 285

Ser Leu Phe Asp Pro Ser Gly Lys Arg Ile Ala Ala Val Ala Leu Arg
290 295 300

Asn Cys Phe Tyr Met Ile Pro Leu Gly Phe Ile Ala Tyr Asp Trp Gly
305 310 315 320

Leu Thr Ser Ser Trp Phe Cys Leu Glu Ser Thr Leu Leu Thr Leu Ala
325 330 335

Ile Ala Ala Thr Ala Phe Ser Phe Tyr Arg Asp Arg Thr Met His Lys
340 345 350

Ala Arg Lys Met Phe His Ala Ser Leu Leu Phe Leu Pro Val Phe Met
355 360 365

Ser Gly Leu Leu Leu His Arg Val Ser Asn Asp Asn Gln Gln Gln Leu
370 375 380

Val Glu Glu Ala Gly Leu Thr Asn Ser Val Ser Gly Glu Val Lys Thr
385 390 395 400

Gln Arg Arg Lys Lys Arg Val Ala Gln Pro Pro Val Ala Tyr Ala Ser
405 410 415

Ala Ala Pro Phe Pro Phe Leu Pro Ala Pro Ser Phe Tyr Ser Pro
420 425 430



<210> 7 
<211> 479 
<212> DNA 

<213> Arabidopsis sp 



<400> 7 



ggaaactccc ggagcacctg tttgcaggta ccgctaacct taatcgataa tttatttctc 60
ttgtcaggaa ttatgtaagt ctggtggaag gctcgcatac catttttgca ttgcctttcg 120
ctatgatcgg gtttactttg ggtgtgatga gaccaggcgt ggctttatgg tatggcgaaa 180
acccattttt atccaatgct gcattccctc ccgatgattc gttctttcat tcctatacag 240
gtatcatgct gataaaactg ttactggtac tggtttgtat ggtatcagca agaagcgcgg 300
cgatggcgtt taaccggtat ctcgacaggc attttgacgc gaagaacccg cgtactgcca 360
tccgtgaaat acctgcgggc gtcatatctg ccaacagtgc gctggtgttt acgataggct 420
gctgcgtggt attctgggtg gcctgttatt tcattaacac gatctgtttt tacctggcg 479



<210> 8 
<211> 551 
<212> DNA 

<213> Arabidopsis sp 



<220>
<221> misc_feature
<222> (1) . . . (551)
<223> n = A,T,C or G

<400> 8

ttgtggctta caccttaatg agcatacgcc agnccattac ggctcgttaa tcggcgccat 60
ngccggngct gntgcaccgg tagtgggcta ctgcgccgtg accaatcagc ttgatctagc 120
ggctcttatt ctgtttttaa ttttactgtt ctggcaaatg ccgcattttt acgcgatttc 180
cattttcagg ctaaaagact tttcagcggc ctgtattccg gtgctgccca tcattaaaga 240
cctgcgctat accaaaatca gcatgctggt ttacgtgggc ttatttacac tggctgctat 300
catgccggcc ctcttagggt atgccggttg gatttatggg atagcggcct taattttagg 360
cttgtattgg ctttatattg ccatacaagg attcaagacc gccgatgatc aaaaatggtc 420
tcgtaagatg tttggatctt cgattttaat cattaccctc ttgtcggtaa tgatgcttgt 480
ttaaacttac tgcctcctga agtttatata tcgataattt cagcttaagg aggcttagtg 540
gttaattcaa t 551

<210> 9
<211> 297
<212> PRT
<213> Arabidopsis sp

<400> 9



Met Val Leu Ala Glu Val Pro Lys Leu Ala Ser Ala Ala Glu Tyr Phe
1 5 10 15

Phe Lys Arg Gly Val Gln Gly Lys Gln Phe Arg Ser Thr Ile Leu Leu
20 25 30

Leu Met Ala Thr Ala Leu Asn Val Arg Val Pro Glu Ala Leu Ile Gly
35 40 45

Glu Ser Thr Asp Ile Val Thr Ser Glu Leu Arg Val Arg Gln Arg Gly
50 55 60

Ile Ala Glu Ile Thr Glu Met Ile His Val Ala Ser Leu Leu His Asp
65 70 75 80

Asp Val Leu Asp Asp Ala Asp Thr Arg Arg Gly Val Gly Ser Leu Asn
85 90 95

Val Val Met Gly Asn Lys Val Val Ala Leu Leu Ala Thr Ala Val Glu
100 105 110

His Leu Val Thr Gly Glu Thr Met Glu Ile Thr Ser Ser Thr Glu Gln
115 120 125

Arg Tyr Ser Met Asp Tyr Tyr Met Gln Lys Thr Tyr Tyr Lys Thr Ala
130 135 140

Ser Leu Ile Ser Asn Ser Cys Lys Ala Val Ala Val Leu Thr Gly Gln
145 150 155 160

Thr Ala Glu Val Ala Val Leu Ala Phe Glu Tyr Gly Arg Asn Leu Gly
165 170 175

Leu Ala Phe Gln Leu Ile Asp Asp Ile Leu Asp Phe Thr Gly Thr Ser
180 185 190

Ala Ser Leu Gly Lys Gly Ser Leu Ser Asp Ile Arg His Gly Val Ile
195 200 205

Thr Ala Pro Ile Leu Phe Ala Met Glu Glu Phe Pro Gln Leu Arg Glu
210 215 220

Val Val Asp Gln Val Glu Lys Asp Pro Arg Asn Val Asp Ile Ala Leu
225 230 235 240

Glu Tyr Leu Gly Lys Ser Lys Gly Ile Gln Arg Ala Arg Glu Leu Ala
245 250 255

Met Glu His Ala Asn Leu Ala Ala Ala Ala Ile Gly Ser Leu Pro Glu
260 265 270

Thr Asp Asn Glu Asp Val Lys Arg Ser Arg Arg Ala Leu Ile Asp Leu
275 280 285

Thr His Arg Val Ile Thr Arg Asn Lys
290 295

<210> 10 
<211> 561 
<212> DNA 

<213> Arabidopsis sp 



<400> 10 








aagcgcatcc gtcctcttct acgattgccg ccagccgcat gtatggctgc ataaccgacc 60
gcccctatcc gctcgcggcc gcggtcgaat tcattcacac cgcgacgctg ctgcatgacg 120
acgtcgtcga tgaaagcgat ttgcgccgcg gccgcgaaag cgcgcataag gttttcggca 180
atcaggcgag cgtgctcgtc ggcgatttcc ttttctcccg cgccttccag ctgatggtgg 240
aagacggctc gctcgacgcg ctgcgcattc tctcggatgc ctccgccgtg atcgcgcagg 300
gcgaagtgat gcagctcggc accgcgcgca atcttgaaac caatatgagc cagtatctcg 360
atgtgatcag cgcgaagacc gccgcgctct ttgccgccgc ctgcgaaatc ggcccggtga 420
tggcgaacgc gaaggcggaa gatgctgccg cgatgtgcga atacggcatg aatctcggta 480
tcgccttcca gatcatcgac gaccttctcg attacggcac cggcggccac gccgagcttg 540
gcaagaacac gggcgacgat t 561



<210> 11 
<211> 966 
<212> DNA 

<213> Arabidopsis sp 



<400> 11

atggtacttg ccgaggttcc aaagcttgcc tctgctgctg agtacttctt caaaaggggt 60
gtgcaaggaa aacagtttcg ttcaactatt ttgctgctga tggcgacagc tctgaatgta 120
cgcgttccag aagcattgat tggggaatca acagatatag tcacatcaga attacgcgta 180
aggcaacggg gtattgctga aatcactgaa atgatacacg tcgcaagtct actgcacgat 240
gatgtcttgg atgatgccga tacaaggcgt ggtgttggtt ccttaaatgt tgtaatgggt 300
aacaagatgt cggtattagc aggagacttc ttgctctccc gggcttgtgg ggctctcgct 360
gctttaaaga acacagaggt tgtagcatta cttgcaactg ctgtagaaca tcttgttacc 420
ggtgaaacca tggaaataac tagttcaacc gagcagcgtt atagtatgga ctactacatg 480
cagaagacat attataagac agcatcgcta atctctaaca gctgcaaagc tgttgccgtt 540
ctcactggac aaacagcaga agttgccgtg ttagcttttg agtatgggag gaatctgggt 600
ttagcattcc aattaataga cgacattctt gatttcacgg gcacatctgc ctctctcgga 660
aagggatcgt tgtcagatat tcgccatgga gtcataacag ccccaatcct ctttgccatg 720
gaagagtttc ctcaactacg cgaagttgtt gatcaagttg aaaaagatcc taggaatgtt 780
gacattgctt tagagtatct tgggaagagc aagggaatac agagggcaag agaattagcc 840
atggaacatg cgaatctagc agcagctgca atcgggtctc tacctgaaac agacaatgaa 900
gatgtcaaaa gatcgaggcg ggcacttatt gacttgaccc atagagtcat caccagaaac 960
aagtga 966

<210> 12
<211> 321
<212> PRT
<213> Arabidopsis sp

<400> 12

Met Val Leu Ala Glu Val Pro Lys Leu Ala Ser Ala Ala Glu Tyr Phe
1 5 10 15

Phe Lys Arg Gly Val Gln Gly Lys Gln Phe Arg Ser Thr Ile Leu Leu
20 25 30

Leu Met Ala Thr Ala Leu Asn Val Arg Val Pro Glu Ala Leu Ile Gly
35 40 45

Glu Ser Thr Asp Ile Val Thr Ser Glu Leu Arg Val Arg Gln Arg Gly
50 55 60

Ile Ala Glu Ile Thr Glu Met Ile His Val Ala Ser Leu Leu His Asp
65 70 75 80

Asp Val Leu Asp Asp Ala Asp Thr Arg Arg Gly Val Gly Ser Leu Asn
85 90 95

Val Val Met Gly Asn Lys Met Ser Val Leu Ala Gly Asp Phe Leu Leu
100 105 110

Ser Arg Ala Cys Gly Ala Leu Ala Ala Leu Lys Asn Thr Glu Val Val
115 120 125

Ala Leu Leu Ala Thr Ala Val Glu His Leu Val Thr Gly Glu Thr Met
130 135 140

Glu Ile Thr Ser Ser Thr Glu Gln Arg Tyr Ser Met Asp Tyr Tyr Met
145 150 155 160

Gln Lys Thr Tyr Tyr Lys Thr Ala Ser Leu Ile Ser Asn Ser Cys Lys
165 170 175

Ala Val Ala Val Leu Thr Gly Gln Thr Ala Glu Val Ala Val Leu Ala
180 185 190

Phe Glu Tyr Gly Arg Asn Leu Gly Leu Ala Phe Gln Leu Ile Asp Asp
195 200 205

Ile Leu Asp Phe Thr Gly Thr Ser Ala Ser Leu Gly Lys Gly Ser Leu
210 215 220

Ser Asp Ile Arg His Gly Val Ile Thr Ala Pro Ile Leu Phe Ala Met
225 230 235 240

Glu Glu Phe Pro Gln Leu Arg Glu Val Val Asp Gln Val Glu Lys Asp
245 250 255

Pro Arg Asn Val Asp Ile Ala Leu Glu Tyr Leu Gly Lys Ser Lys Gly
260 265 270

Ile Gln Arg Ala Arg Glu Leu Ala Met Glu His Ala Asn Leu Ala Ala
275 280 285

Ala Ala Ile Gly Ser Leu Pro Glu Thr Asp Asn Glu Asp Val Lys Arg
290 295 300

Ser Arg Arg Ala Leu Ile Asp Leu Thr His Arg Val Ile Thr Arg Asn
305 310 315 320

Lys



<210> 13 
<211> 621 
<212> DNA 

<213> Arabidopsis sp 
<400> 13 



gctttctcct ttgctaattc ttgagctttc ttgatcccac cgcgatttct aactatttca 60
atcgcttctt caagcgatcc aggctcacaa aactcagact caatgatctc tcttagcctt 120
ggctcattct ctagcgcgaa gatcactggc gccgttatgt tacctttggc taagtcatta 180
gctgcaggct tacctaactg ctctgtggac tgagtgaagt ccagaatgtc atcaactact 240
tgaaaagata aaccgagatt cttcccgaac tgatacattt gctctgcgac cttgctttcg 300
actttactga aaattgctgc tcctttggtg cttgcagcta ctaatgaagc tgtcttgtag 360
taactcttta gcatgtagtc atcaagcttg acatcacaat cgaataaact cgatgcttgc 420
tttatctcac cgcttgcaaa atctttgatc acctgcaaaa agataaatca agattcagac 480
caaatgttct ttgtattgag tagcttcatc taatctcaga aaggaatatt acctgactta 540
tgagcttaat gacttcaagg ttttcgagat ttgtaagtac catgatgctt gagcaacatg 600
aaatccccag ctaatacagc t 621

<210> 14
<211> 741
<212> DNA
<213> Arabidopsis sp

<400> 14








ggtgagtttt gttaatagtt atgagattca tctatttttg tcataaaatt gtttggtttg 60
gtttaaactc tgtgtataat tgcaggaaag gaaacagttc atgagctttt cggcacaaga 120
gtagcggtgc tagctggaga tttcatgttt gctcaagcgt catggtactt agcaaatctc 180
gagaatcttg aagttattaa gctcatcagt caggtactta gttactctta cattgttttt 240
ctatgaggtt gagctatgaa tctcatttcg ttgaataatg ctgtgcctca aacttttttt 300
catgttttca ggtgatcaaa gactttgcaa gcggagagat aaagcaggcg tccagcttat 360
ttgactgcga caccaagctc gacgagtact tactcaaaag tttctacaag acagcctctt 420
tagtggctgc gagcaccaaa ggagctgcca ttttcagcag agttgagcct gatgtgacag 480
aacaaatgta cgagtttggg aagaatctcg gtctctcttt ccagatagtt gatgatattt 540
tggatttcac tcagtcgaca gagcagctcg ggaagccagc agggagtgat ttggctaaag 600
gtaacttaac agcacctgtg attttcgctc tggagaggga gccaaggcta agagagatca 660
ttgagtcaaa gttctgtgag gcgggttctc tggaagaagc gattgaagcg gtgacaaaag 720
gtggggggat taagagagca c 741



<210> 15 
<211> 1087 
<212> DNA 

<213> Arabidopsis sp 



<400> 15

cctcttcagc caatccagag gaagaagaga caacttttta tctttcgtca agagtctccg 60
aaaacgcacg gttttatgct ctctcttctg ccctcacctc acaagacgca gggcacatga 120
ttcaaccaga gggaaaaagc aacgataaca actctgcttt tgatttcaag ctgtatatga 180
tccgcaaagc cgagtctgta aatgcggctc tcgacgtttc cgtaccgctt ctgaaacccc 240
ttacgatcca agaagcggtc aggtactctt tgctagccgg cggaaaacgt gtgaggcctc 300
tgctctgcat tgccgcttgt gagcttgtgg ggggcgacga ggctactgcc atgtcagccg 360
cttgcgcggt cgagatgatc cacacaagct ctctcattca tgacgatctt ccgtgcatgg 420
acaatgccga cctccgtaga ggcaagccca ccaatcacaa ggtatgttgt ttaattatat 480
gaaggctcag agataatgct gaactagtgt tgaaccaatt tttgctcaaa caaggtatat 540
ggagaagaca tggcggtttt ggcaggtgat gcactccttg cattggcgtt tgagcacatg 600
acggttgtgt cgagtgggtt ggtcgctccc gagaagatga ttcgcgccgt ggttgagctg 660
gccagggcca tagggactac agggctagtt gctggacaaa tgatagacct agccagcgaa 720
agactgaatc cagacaaggt tggattggag catctagagt tcatccatct ccacaaaacg 780
gcggcattgt tggaggcagc ggcagtttta ggggttataa tgggaggtgg aacagaggaa 840
gaaatcgaaa agcttagaaa gtatgctagg tgtattggac tactgtttca ggttgttgat 900
gacattctcg acgtaacaaa atctactgag gaattgggta agacagccgg aaaagacgta 960
atggccggaa agctgacgta tccaaggctg ataggtttgg agggatccag ggaagttgca 1020
gagcacctga ggagagaagc agaggaaaag cttaaagggt ttgatccaag tcaggcggcg 1080
cctctgg 1087

<210> 16
<211> 1164
<212> DNA
<213> Arabidopsis sp

<400> 16

atgacttcga ttctcaacac tgtctccacc atccactctt ccagagttac ctccgtcgat 60
cgagtcggag tcctctctct tcggaattcg gattccgttg agttcactcg ccggcgttct 120
ggtttctcga cgttgatcta cgaatcaccc gggcggagat ttgttgtgcg tgcggcggag 180
actgatactg ataaagttaa atctcagaca cctgacaagg caccagccgg tggttcaagc 240
attaaccagc ttctcggtat caaaggagca tctcaagaaa ctaataaatg gaagattcgt 300
cttcagctta caaaaccagt cacttggcct ccactggttt ggggagtcgt ctgtggtgct 360
gctgcttcag ggaactttca ttggacccca gaggatgttg ctaagtcgat tctttgcatg 420
atgatgtctg gtccttgtct tactggctat acacagacaa tcaacgactg gtatgataga 480
gatatcgacg caattaatga gccatatcgt ccaattccat ctggagcaat atcagagcca 540
gaggttatta cacaagtctg ggtgctatta ttgggaggtc ttggtattgc tggaatatta 600
gatgtgtggg cagggcatac cactcccact gtcttctatc ttgctttggg aggatcattg 660
ctatcttata tatactctgc tccacctctt aagctaaaac aaaatggatg ggttggaaat 720
tttgcacttg gagcaagcta tattagtttg ccatggtggg ctggccaagc attgtttggc 780
actcttacgc cagatgttgt tgttctaaca ctcttgtaca gcatagctgg gttaggaata 840
gccattgtta acgacttcaa aagtgttgaa ggagatagag cattaggact tcagtctctc 900
ccagtagctt ttggcaccga aactgcaaaa tggatatgcg ttggtgctat agacattact 960
cagctttctg ttgccggata tctattagca tctgggaaac cttattatgc gttggcgttg 1020
gttgctttga tcattcctca gattgtgttc cagtttaaat actttctcaa ggaccctgtc 1080
aaatacgacg tcaagtacca ggcaagcgcg cagccattct tggtgctcgg aatatttgta 1140
acggcattag catcgcaaca ctga 1164

<210> 17
<211> 387
<212> PRT
<213> Arabidopsis sp

<400> 17

Met Thr Ser Ile Leu Asn Thr Val Ser Thr Ile His Ser Ser Arg Val
1 5 10 15

Thr Ser Val Asp Arg Val Gly Val Leu Ser Leu Arg Asn Ser Asp Ser
20 25 30

Val Glu Phe Thr Arg Arg Arg Ser Gly Phe Ser Thr Leu Ile Tyr Glu
35 40 45

Ser Pro Gly Arg Arg Phe Val Val Arg Ala Ala Glu Thr Asp Thr Asp
50 55 60

Lys Val Lys Ser Gln Thr Pro Asp Lys Ala Pro Ala Gly Gly Ser Ser
65 70 75 80

Ile Asn Gln Leu Leu Gly Ile Lys Gly Ala Ser Gln Glu Thr Asn Lys
85 90 95

Trp Lys Ile Arg Leu Gln Leu Thr Lys Pro Val Thr Trp Pro Pro Leu
100 105 110

Val Trp Gly Val Val Cys Gly Ala Ala Ala Ser Gly Asn Phe His Trp
115 120 125

Thr Pro Glu Asp Val Ala Lys Ser Ile Leu Cys Met Met Met Ser Gly
130 135 140

Pro Cys Leu Thr Gly Tyr Thr Gln Thr Ile Asn Asp Trp Tyr Asp Arg
145 150 155 160

Asp Ile Asp Ala Ile Asn Glu Pro Tyr Arg Pro Ile Pro Ser Gly Ala
165 170 175

Ile Ser Glu Pro Glu Val Ile Thr Gln Val Trp Val Leu Leu Leu Gly
180 185 190

Gly Leu Gly Ile Ala Gly Ile Leu Asp Val Trp Ala Gly His Thr Thr
195 200 205

Pro Thr Val Phe Tyr Leu Ala Leu Gly Gly Ser Leu Leu Ser Tyr Ile
210 215 220

Tyr Ser Ala Pro Pro Leu Lys Leu Lys Gln Asn Gly Trp Val Gly Asn
225 230 235 240

Phe Ala Leu Gly Ala Ser Tyr Ile Ser Leu Pro Trp Trp Ala Gly Gln
245 250 255

Ala Leu Phe Gly Thr Leu Thr Pro Asp Val Val Val Leu Thr Leu Leu
260 265 270

Tyr Ser Ile Ala Gly Leu Gly Ile Ala Ile Val Asn Asp Phe Lys Ser
275 280 285

Val Glu Gly Asp Arg Ala Leu Gly Leu Gln Ser Leu Pro Val Ala Phe
290 295 300

Gly Thr Glu Thr Ala Lys Trp Ile Cys Val Gly Ala Ile Asp Ile Thr
305 310 315 320

Gln Leu Ser Val Ala Gly Tyr Leu Leu Ala Ser Gly Lys Pro Tyr Tyr
325 330 335

Ala Leu Ala Leu Val Ala Leu Ile Ile Pro Gln Ile Val Phe Gln Phe
340 345 350

Lys Tyr Phe Leu Lys Asp Pro Val Lys Tyr Asp Val Lys Tyr Gln Ala
355 360 365

Ser Ala Gln Pro Phe Leu Val Leu Gly Ile Phe Val Thr Ala Leu Ala
370 375 380

Ser Gln His
385



<210> 18
<211> 981
<212> DNA
<213> Arabidopsis sp

<400> 18

atgttgttta gtggttcagc gatcccatta agcagcttct gctctcttcc ggagaaaccc 60
cacactcttc ctatgaaact ctctcccgct gcaatccgat cttcatcctc atctgccccg 120
gggtcgttga acttcgatct gaggacgtat tggacgactc tgatcaccga gatcaaccag 180
aagctggatg aggccatacc ggtcaagcac cctgcgggga tctacgaggc tatgagatac 240
tctgtactcg cacaaggcgc caagcgtgcc cctcctgtga tgtgtgtggc ggcctgcgag 300
ctcttcggtg gcgatcgcct cgccgctttc cccaccgcct gtgccctaga aatggtgcac 360
gcggcttcgt tgatacacga cgacctcccc tgtatggacg acgatcctgt gcgcagagga 420
aagccatcta accacactgt ctacggctct ggcatggcca ttctcgccgg tgacgccctc 480
ttcccactcg ccttccagca cattgtctcc cacacgcctc ctgaccttgt tccccgagcc 540
accatcctca gactcatcac tgagattgcc cgcactgtcg gctccactgg tatggctgca 600
ggccagtacg tcgaccttga aggaggtccc tttcctcttt cctttgttca ggagaagaaa 660
ttcggagcca tgggtgaatg ctctgccgtg tgcggtggcc tattgggcgg tgccactgag 720
gatgagctcc agagtctccg aaggtacggg agagccgtcg ggatgctgta tcaggtggtc 780
gatgacatca ccgaggacaa gaagaagagc tatgatggtg gagcagagaa gggaatgatg 840
gaaatggcgg aagagctcaa ggagaaggcg aagaaggagc ttcaagtgtt tgacaacaag 900
tatggaggag gagacacact tgttcctctc tacaccttcg ttgactacgc tgctcatcga 960
cattttcttc ttcccctctg a 981

<210> 19
<211> 245
<212> DNA
<213> Glycine sp

<400> 19

gcaacatctg ggactgggtt tgtcttgggg agtggtagtg ctgttgatct ttcggcactt 60
tcttgcactt gcttgggtac catgatggtt gctgcatctg ctaactcttt gaatcaggtg 120
tttgagatca ataatgatgc taaaatgaag agaacaagtc gcaggccact accctcagga 180
cgcatcacaa tacctcatgc agttggctgg gcatcctctg ttggattagc tggtacggct 240
ctact 245

<210> 20
<211> 253
<212> DNA
<213> Glycine sp

<400> 20

attggctttc caagatcatt gggttttctt gttgcattca tgaccttcta ctccttgggt 60
ttggcattgt ccaaggatat acctgacgtt gaaggagata aagagcacgg cattgattct 120
tttgcagtac gtctaggtca gaaacgggca ttttggattt gcgtttcctt ttttgaaatg 180
gctttcggag ttggtatcct ggccggagca tcatgctcac acttttggac taaaattttc 240
acgggtatgg gaa 253

<210> 21
<211> 275
<212> DNA
<213> Glycine sp

<400> 21

tgatcttcta ctctctgggt atggcattgt ccaaggatat atctgacgtt aaaggagata 60
aagcatacgg catcgatact ttagcgatac gtttgggtca aaaatgggta ttttggattt 120
gcattatcct ttttgaaatg gcttttggag ttgccctctt ggcaggagca acatcttctt 180
acctttggat taaaattgtc acgggtctgg gacatgctat tcttgcttca attctcttgt 240
accaagccaa atctatatac ttgagcaaca aagtt 275

<210> 22
<211> 299
<212> DNA
<213> Glycine sp

<220>
<221> misc_feature
<222> (1) . . . (299)
<223> n = A,T,C or G

<400> 22

ccanaatang tncatcttng aaagacaatt ggcctcttca acacacaagt ctgcatgtga 60
agaagaggcc aattgtcttt ccaagatcac ttatngtggc tattgtaatc atgaacttct 120
tctttgtggg tatggcattg gcaaaggata tacctanctg ttgaaggaga taaaatatat 180
ggcattgata cttttgcaat acgtataggt caaaaacaag tattttggat ttgtattttc 240
ctttttgaaa ggctttcgga gtttccctag tggcaggagc aacatcttct agccttggt 299

<210> 23
<211> 767
<212> DNA
<213> Glycine sp

<400> 23

gtggaggctg tggttgctgc cctgtttatg aatatttata ttgttggttt gaatcaattg 60
tctgatgttg aaatagacaa gataaacaag ccgtatcttc cattagcatc tggggaatat 120
tcctttgaaa ctggtgtcac tattgttgca tctttttcaa ttctgagttt ttggcttggc 180
tgggttgtag gttcatggcc attattttgg gccctttttg taagctttgt gctaggaact 240
gcttattcaa tcaatgtgcc tctgttgaga tggaagaggt ttgcagtgct tgcagcgatg 300
tgcattctag ctgttcgggc agtaatagtt caacttgcat ttttccttca catgcagact 360
catgtgtaca agaggccacc tgtcttttca agaccattga tttttgctac tgcattcatg 420
agcttcttct ctgtagttat agcactgttt aaggatatac ctgacattga aggagataaa 480
gtatttggca tccaatcttt ttcagtgtgt ttaggtcaga agccggtgtt ctggacttgt 540
gttacccttc ttgaaatagc ttatggagtc gccctcctgg tgggagctgc atctccttgt 600
ctttggagca aaattttcac gggtctggga cacgctgtgc tggcttcaat tctctggttt 660
catgccaaat ctgtagattt gaaaagcaaa gcttcgataa catccttcta tatgtttatt 720
tggaagctat tttatgcaga atacttactc attccttttg ttagatg 767



<210> 24 
<211> 255 
<212> PRT
<213> Glycine sp

<400> 24

Val Glu Ala Val Val Ala Ala Leu Phe Met Asn Ile Tyr Ile Val Gly
1 5 10 15

Leu Asn Gln Leu Ser Asp Val Glu Ile Asp Lys Ile Asn Lys Pro Tyr
20 25 30

Leu Pro Leu Ala Ser Gly Glu Tyr Ser Phe Glu Thr Gly Val Thr Ile
35 40 45

Val Ala Ser Phe Ser Ile Leu Ser Phe Trp Leu Gly Trp Val Val Gly
50 55 60

Ser Trp Pro Leu Phe Trp Ala Leu Phe Val Ser Phe Val Leu Gly Thr
65 70 75 80

Ala Tyr Ser Ile Asn Val Pro Leu Leu Arg Trp Lys Arg Phe Ala Val
85 90 95

Leu Ala Ala Met Cys Ile Leu Ala Val Arg Ala Val Ile Val Gln Leu
100 105 110

Ala Phe Phe Leu His Met Gln Thr His Val Tyr Lys Arg Pro Pro Val
115 120 125

Phe Ser Arg Pro Leu Ile Phe Ala Thr Ala Phe Met Ser Phe Phe Ser
130 135 140

Val Val Ile Ala Leu Phe Lys Asp Ile Pro Asp Ile Glu Gly Asp Lys
145 150 155 160

Val Phe Gly Ile Gln Ser Phe Ser Val Cys Leu Gly Gln Lys Pro Val
165 170 175

Phe Trp Thr Cys Val Thr Leu Leu Glu Ile Ala Tyr Gly Val Ala Leu
180 185 190

Leu Val Gly Ala Ala Ser Pro Cys Leu Trp Ser Lys Ile Phe Thr Gly
195 200 205

Leu Gly His Ala Val Leu Ala Ser Ile Leu Trp Phe His Ala Lys Ser
210 215 220

Val Asp Leu Lys Ser Lys Ala Ser Ile Thr Ser Phe Tyr Met Phe Ile
225 230 235 240

Trp Lys Leu Phe Tyr Ala Glu Tyr Leu Leu Ile Pro Phe Val Arg
245 250 255



<210> 25
<211> 360
<212> DNA
<213> Zea sp

<220>
<221> misc_feature
<222> (1) . . . (360)
<223> n = A,T,C or G

<400> 25

ggcgtcttca cttgttctgg tcttctcgta tcccctgatg aagaggttca cattttggcc 60
tcaggcttat cttggcctga cattcaactg gggagcttta ctagggtggg ctgctattaa 120
ggaaagcata gaccctgcaa atcatccttc cattgtatac agctggtatt tgttggacgc 180
tggtgtatga tactatatat gcgcatcagg tgtttcgcta tccctacttt catattaatc 240
cttgatgaag tggccatttc atgttgtcgc ggtggtctta tacttgcata tctccatgca 300
tctcaggaca aagangatga cctgaaagta ggagtccaag tccacagctt aagatttggg 360

<210> 26
<211> 299
<212> DNA
<213> Zea sp

<220>
<221> misc_feature
<222> (1) . . . (299)
<223> n = A,T,C or G

<400> 26

gatggttgca gcatctgcaa ataccctcaa ccaggtgttt gngataaaaa atgatgctaa 60
aatgaaaagg acaatgcgtg ccccctgcca tctggtcgca ttagtcctgc acatgctgcg 120
atgtgggcta caagtgttgg agttgcagga acagctttgt tggcctggaa ggctaatggc 180
ttggcagctg ggcttgcagc ttctaatctt gttctgtatg catttgtgta tacgccgttg 240
aagcaaatac accctgttaa tacatgggtt ggggcagtcg ttggtgccat cccaccact 299



<210> 27
<211> 255
<212> DNA
<213> Zea sp

<220>
<221> misc_feature
<222> (1) . . . (255)
<223> n = A,T,C or G

<400> 27

anacttgcat atctccatgc ntctcaggac aaagangatg acctgaaagt aggtgtcaag 60
tccacagcat taagatttgg agatttgacc nnatactgna tcagtggctt tggcgcggca 120
tgcttcggca gcttagcact cagtggttac aatgctgacc ttggttggtg tttagtgtga 180
tgcttgagcg aagaatggta tngtttttac ttgatattga ctccagacct gaaatcatgt 240
tggacagggt ggccc 255

<210> 28
<211> 257
<212> DNA
<213> Zea sp

<400> 28

attgaagggg ataggactct ggggcttcag tcacttcctg ttgcttttgg gatggaaact 60
gcaaaatgga tttgtgttgg agcaattgat atcactcaat tatctgttgc aggttaccta 120
ttgagcaccg gtaagctgta ttatgccctg gtgttgcttg ggctaacaat tcctcaggtg 180
ttctttcagt tccagtactt cctgaaggac cctgtgaagt atgatgtcaa atatcaggca 240
agcgcacaac cattctt 257

<210> 29
<211> 368
<212> DNA
<213> Zea sp

<400> 29

atccagttgc aaataataat ggcgttcttc tctgttgtaa tagcactatt caaggatata 60
cctgacatcg aaggggaccg catattcggg atccgatcct tcagcgtccg gttagggcaa 120
aagaaggtct tttggatctg cgttggcttg cttgagatgg cctacagcgt tgcgatactg 180
atgggagcta cctcttcctg tttgtggagc aaaacagcaa ccatcgctgg ccattccata 240
cttgccgcga tcctatggag ctgcgcgcga tcggtggact tgacgagcaa agccgcaata 300
acgtccttct acatgttcat ctggaagctg ttctacgcgg agtacctgct catccctctg 360
gtgcggtg 368

<210> 30
<211> 122
<212> PRT
<213> Zea sp

<400> 30

Ile Gln Leu Gln Ile Ile Met Ala Phe Phe Ser Val Val Ile Ala Leu
1 5 10 15

Phe Lys Asp Ile Pro Asp Ile Glu Gly Asp Arg Ile Phe Gly Ile Arg
20 25 30

Ser Phe Ser Val Arg Leu Gly Gln Lys Lys Val Phe Trp Ile Cys Val
35 40 45

Gly Leu Leu Glu Met Ala Tyr Ser Val Ala Ile Leu Met Gly Ala Thr
50 55 60

Ser Ser Cys Leu Trp Ser Lys Thr Ala Thr Ile Ala Gly His Ser Ile
65 70 75 80

Leu Ala Ala Ile Leu Trp Ser Cys Ala Arg Ser Val Asp Leu Thr Ser
85 90 95

Lys Ala Ala Ile Thr Ser Phe Tyr Met Phe Ile Trp Lys Leu Phe Tyr
100 105 110

Ala Glu Tyr Leu Leu Ile Pro Leu Val Arg
115 120



<210> 31 
<211> 278 
<212> DNA 
<213> Zea sp 

<400> 31 

tattcagcac cacctctcaa gctcaagcag aatggatgga ttgggaactt cgctctgggt 60 

gcgagttaca tcagcttgcc ctggtgggct ggccaggcgt tatttggaac tcttacacca 120 

gatatcattg tcttgactac tttgtacagc atagctgggc tagggattgc tattgtaaat 180 

gatttcaaga gtattgaagg ggataggact ctggggcttc agtcacttcc tgttgctttt 240 

gggatggaaa ctgcaaaatg gatttgtgtt ggagcaat 278 

<210> 32
<211> 292
<212> PRT
<213> Synechocystis sp

<400> 32

Met Val Ala Gln Thr Pro Ser Ser Pro Pro Leu Trp Leu Thr Ile Ile
1 5 10 15

Tyr Leu Leu Arg Trp His Lys Pro Ala Gly Arg Leu Ile Leu Met Ile
20 25 30

Pro Ala Leu Trp Ala Val Cys Leu Ala Ala Gln Gly Leu Pro Pro Leu
35 40 45

Pro Leu Leu Gly Thr Ile Ala Leu Gly Thr Leu Ala Thr Ser Gly Leu
50 55 60

Gly Cys Val Val Asn Asp Leu Trp Asp Arg Asp Ile Asp Pro Gln Val
65 70 75 80

Glu Arg Thr Lys Gln Arg Pro Leu Ala Ala Arg Ala Leu Ser Val Gln
85 90 95

Val Gly Ile Gly Val Ala Leu Val Ala Leu Leu Cys Ala Ala Gly Leu
100 105 110

Ala Phe Tyr Leu Thr Pro Leu Ser Phe Trp Leu Cys Val Ala Ala Val
115 120 125

Pro Val Ile Val Ala Tyr Pro Gly Ala Lys Arg Val Phe Pro Val Pro
130 135 140

Gln Leu Val Leu Ser Ile Ala Trp Gly Phe Ala Val Leu Ile Ser Trp
145 150 155 160

Ser Ala Val Thr Gly Asp Leu Thr Asp Ala Thr Trp Val Leu Trp Gly
165 170 175

Ala Thr Val Phe Trp Thr Leu Gly Phe Asp Thr Val Tyr Ala Met Ala
180 185 190

Asp Arg Glu Asp Asp Arg Arg Ile Gly Val Asn Ser Ser Ala Leu Phe
195 200 205

Phe Gly Gln Tyr Val Gly Glu Ala Val Gly Ile Phe Phe Ala Leu Thr
210 215 220

Ile Gly Cys Leu Phe Tyr Leu Gly Met Ile Leu Met Leu Asn Pro Leu
225 230 235 240

Tyr Trp Leu Ser Leu Ala Ile Ala Ile Val Gly Trp Val Ile Gln Tyr
245 250 255

Ile Gln Leu Ser Ala Pro Thr Pro Glu Pro Lys Leu Tyr Gly Gln Ile
260 265 270

Phe Gly Gln Asn Val Ile Ile Gly Phe Val Leu Leu Ala Gly Met Leu
275 280 285

Leu Gly Trp Leu
290



<210> 33
<211> 316
<212> PRT
<213> Synechocystis sp

<400> 33

Met Val Thr Ser Thr Lys Ile His Arg Gln His Asp Ser Met Gly Ala
1 5 10 15

Val Cys Lys Ser Tyr Tyr Gln Leu Thr Lys Pro Arg Ile Ile Pro Leu
20 25 30

Leu Leu Ile Thr Thr Ala Ala Ser Met Trp Ile Ala Ser Glu Gly Arg
35 40 45

Val Asp Leu Pro Lys Leu Leu Ile Thr Leu Leu Gly Gly Thr Leu Ala
50 55 60

Ala Ala Ser Ala Gln Thr Leu Asn Cys Ile Tyr Asp Gln Asp Ile Asp
65 70 75 80

Tyr Glu Met Leu Arg Thr Arg Ala Arg Pro Ile Pro Ala Gly Lys Val
85 90 95

Gln Pro Arg His Ala Leu Ile Phe Ala Leu Ala Leu Gly Val Leu Ser
100 105 110

Phe Ala Leu Leu Ala Thr Phe Val Asn Val Leu Ser Gly Cys Leu Ala
115 120 125

Leu Ser Gly Ile Val Phe Tyr Met Leu Val Tyr Thr His Trp Leu Lys
130 135 140

Arg His Thr Ala Gln Asn Ile Val Ile Gly Gly Ala Ala Gly Ser Ile
145 150 155 160

Pro Pro Leu Val Gly Trp Ala Ala Val Thr Gly Asp Leu Ser Trp Thr
165 170 175

Pro Trp Val Leu Phe Ala Leu Ile Phe Leu Trp Thr Pro Pro His Phe
180 185 190

Trp Ala Leu Ala Leu Met Ile Lys Asp Asp Tyr Ala Gln Val Asn Val
195 200 205

Pro Met Leu Pro Val Ile Ala Gly Glu Glu Lys Thr Val Ser Gln Ile
210 215 220

Trp Tyr Tyr Ser Leu Leu Val Val Pro Phe Ser Leu Leu Leu Val Tyr
225 230 235 240

Pro Leu His Gln Leu Gly Ile Leu Tyr Leu Ala Ile Ala Ile Ile Leu
245 250 255

Gly Gly Gln Phe Leu Val Lys Ala Trp Gln Leu Lys Gln Ala Pro Gly
260 265 270

Asp Arg Asp Leu Ala Arg Gly Leu Phe Lys Phe Ser Ile Phe Tyr Leu
275 280 285

Met Leu Leu Cys Leu Ala Met Val Ile Asp Ser Leu Pro Val Thr His
290 295 300

Gln Leu Val Ala Gln Met Gly Thr Leu Leu Leu Gly
305 310 315

<210> 34
<211> 324
<212> PRT
<213> Synechocystis sp

<400> 34

Met Ser Asp Thr Gln Asn Thr Gly Gln Asn Gln Ala Lys Ala Arg Gln
1 5 10 15

Leu Leu Gly Met Lys Gly Ala Ala Pro Gly Glu Ser Ser Ile Trp Lys
20 25 30

Ile Arg Leu Gln Leu Met Lys Pro Ile Thr Trp Ile Pro Leu Ile Trp
35 40 45

Gly Val Val Cys Gly Ala Ala Ser Ser Gly Gly Tyr Ile Trp Ser Val
50 55 60

Glu Asp Phe Leu Lys Ala Leu Thr Cys Met Leu Leu Ser Gly Pro Leu
65 70 75 80

Met Thr Gly Tyr Thr Gln Thr Leu Asn Asp Phe Tyr Asp Arg Asp Ile
85 90 95

Asp Ala Ile Asn Glu Pro Tyr Arg Pro Ile Pro Ser Gly Ala Ile Ser
100 105 110

Val Pro Gln Val Val Thr Gln Ile Leu Ile Leu Leu Val Ala Gly Ile
115 120 125

Gly Val Ala Tyr Gly Leu Asp Val Trp Ala Gln His Asp Phe Pro Ile
130 135 140

Met Met Val Leu Thr Leu Gly Gly Ala Phe Val Ala Tyr Ile Tyr Ser
145 150 155 160

Ala Pro Pro Leu Lys Leu Lys Gln Asn Gly Trp Leu Gly Asn Tyr Ala
165 170 175

Leu Gly Ala Ser Tyr Ile Ala Leu Pro Trp Trp Ala Gly His Ala Leu
180 185 190

Phe Gly Thr Leu Asn Pro Thr Ile Met Val Leu Thr Leu Ile Tyr Ser
195 200 205

Leu Ala Gly Leu Gly Ile Ala Val Val Asn Asp Phe Lys Ser Val Glu
210 215 220

Gly Asp Arg Gln Leu Gly Leu Lys Ser Leu Pro Val Met Phe Gly Ile
225 230 235 240

Gly Thr Ala Ala Trp Ile Cys Val Ile Met Ile Asp Val Phe Gln Ala
245 250 255

Gly Ile Ala Gly Tyr Leu Ile Tyr Val His Gln Gln Leu Tyr Ala Thr
260 265 270

Ile Val Leu Leu Leu Leu Ile Pro Gln Ile Thr Phe Gln Asp Met Tyr
275 280 285

Phe Leu Arg Asn Pro Leu Glu Asn Asp Val Lys Tyr Gln Ala Ser Ala
290 295 300

Gln Pro Phe Leu Val Phe Gly Met Leu Ala Thr Gly Leu Ala Leu Gly
305 310 315 320

His Ala Gly Ile



<210> 35 
<211> 307 
<212> PRT

<213> Synechocystis sp

<400> 35

Met Thr Glu Ser Ser Pro Leu Ala Pro Ser Thr Ala Pro Ala Thr Ara
1 5 10 15

Lys Leu Trp Leu Ala Ala Ile Lys Pro Pro Met Tyr Thr Val Ala Val
20 25 30

Val Pro Ile Thr Val Gly Ser Ala Val Ala Tyr Gly Leu Thr Gly Gln
35 40 45

Trp His Gly Asp Val Phe Thr Ile Phe Leu Leu Ser Ala Ile Ala Ile
50 55 60

Ile Ala Trp Ile Asn Leu Ser Asn Asp Val Phe Asp Ser Asp Thr Gly
65 70 75 80

Ile Asp Val Arg Lys Ala His Ser Val Val Asn Leu Thr Gly Asn Arg
85 90 95

Asn Leu Val Phe Leu Ile Ser Asn Phe Phe Leu Leu Ala Gly Val Leu
100 105 110

Gly Leu Met Ser Met Ser Trp Arg Ala Gln Asp Trp Thr Val Leu Glu
115 120 125

Leu Ile Gly Val Ala Ile Phe Leu Gly Tyr Thr Tyr Gln Gly Pro Pro
130 135 140

Phe Arg Leu Gly Tyr Leu Gly Leu Gly Glu Leu Ile Cys Leu Ile Thr
145 150 155 160

Phe Gly Pro Leu Ala Ile Ala Ala Ala Tyr Tyr Ser Gln Ser Gln Ser
165 170 175

Phe Ser Trp Asn Leu Leu Thr Pro Ser Val Phe Val Gly Ile Ser Thr
180 185 190

Ala Ile Ile Leu Phe Cys Ser His Phe His Gln Val Glu Asp Asp Leu
195 200 205

Ala Ala Gly Lys Lys Ser Pro Ile Val Arg Leu Gly Thr Lys Leu Gly
210 215 220

Ser Gln Val Leu Thr Leu Ser Val Val Ser Leu Tyr Leu Ile Thr Ala
225 230 235 240

Ile Gly Val Leu Cys His Gln Ala Pro Trp Gln Thr Leu Leu Ile Ile
245 250 255

Ala Ser Leu Pro Trp Ala Val Gln Leu Ile Arg His Val Gly Gln Tyr
260 265 270

His Asp Gln Pro Glu Gln Val Ser Asn Cys Lys Phe Ile Ala Val Asn
275 280 285

Leu His Phe Phe Ser Gly Met Leu Met Ala Ala Gly Tyr Gly Trp Ala
290 295 300

Gly Leu Gly
305

<210> 36 
<211> 927 
<212> DNA 

<213> Synechocystis sp 



<400> 36 

atggcaacta tccaagcttt ttggcgcttc tcccgccccc ataccatcat tggtacaact 60
ctgagcgtct gggctgtgta tctgttaact attctcgggg atggaaactc agttaactcc 120
cctgcttccc tggatttagt gttcggcgct tggctggcct gcctgttggg taatgtgtac 180




WO 01/79472 



PCT/US01/12334 



attgtcggcc tcaaccaatt gtgggatgtg gacattgacc gcatcaataa gccgaatttg 240
cccctagcta acggagattt ttctatcgcc cagggccgtt ggattgtggg actttgtggc 300
gttgcttcct tggcgatcgc ctggggatta gggctatggc tggggctaac ggtgggcatt 360
agtttgatta ttggcacggc ctattcggtg ccgccagtga ggttaaagcg cttttccctg 420
ctggcggccc tgtgtattct gacggtgcgg ggaattgtgg ttaacttggg cttattttta 480
ttttttagaa ttggtttagg ttatcccccc actttaataa cccccatctg ggttttgact 540
ttatttatct tagttttcac cgtggcgatc gccattttta aagatgtgcc agatatggaa 600
ggcgatcggc aatttaagat tcaaacttta actttgcaaa tcggcaaaca aaacgttttt 660
cggggaacct taattttact cactggttgt tatttagcca tggcaatctg gggcttatgg 720
gcggctatgc ctttaaatac tgctttcttg attgtttccc atttgtgctt attagcctta 780
ctctggtggc ggagtcgaga tgtacactta gaaagcaaaa ccgaaattgc tagtttttat 840
cagtttattt ggaagctatt tttcttagag tacttgctgt atcccttggc tctgtggtta 900
cctaattttt ctaatactat tttttag 927

<210> 37
<211> 308
<212> PRT

<213> Synechocystis sp
<400> 37

Met Ala Thr Ile Gln Ala Phe Trp Arg Phe Ser Arg Pro His Thr Ile
1 5 10 15

Ile Gly Thr Thr Leu Ser Val Trp Ala Val Tyr Leu Leu Thr Ile Leu
20 25 30

Gly Asp Gly Asn Ser Val Asn Ser Pro Ala Ser Leu Asp Leu Val Phe
35 40 45

Gly Ala Trp Leu Ala Cys Leu Leu Gly Asn Val Tyr Ile Val Gly Leu
50 55 60

Asn Gln Leu Trp Asp Val Asp Ile Asp Arg Ile Asn Lys Pro Asn Leu
65 70 75 80

Pro Leu Ala Asn Gly Asp Phe Ser Ile Ala Gln Gly Arg Trp Ile Val
85 90 95

Gly Leu Cys Gly Val Ala Ser Leu Ala Ile Ala Trp Gly Leu Gly Leu
100 105 110

Trp Leu Gly Leu Thr Val Gly Ile Ser Leu Ile Ile Gly Thr Ala Tyr
115 120 125

Ser Val Pro Pro Val Arg Leu Lys Arg Phe Ser Leu Leu Ala Ala Leu
130 135 140

Cys Ile Leu Thr Val Arg Gly Ile Val Val Asn Leu Gly Leu Phe Leu
145 150 155 160

Phe Phe Arg Ile Gly Leu Gly Tyr Pro Pro Thr Leu Ile Thr Pro Ile
165 170 175

Trp Val Leu Thr Leu Phe Ile Leu Val Phe Thr Val Ala Ile Ala Ile
180 185 190

Phe Lys Asp Val Pro Asp Met Glu Gly Asp Arg Gln Phe Lys Ile Gln
195 200 205

Thr Leu Thr Leu Gln Ile Gly Lys Gln Asn Val Phe Arg Gly Thr Leu
210 215 220

Ile Leu Leu Thr Gly Cys Tyr Leu Ala Met Ala Ile Trp Gly Leu Trp
225 230 235 240

Ala Ala Met Pro Leu Asn Thr Ala Phe Leu Ile Val Ser His Leu Cys
245 250 255

Leu Leu Ala Leu Leu Trp Trp Arg Ser Arg Asp Val His Leu Glu Ser
260 265 270

Lys Thr Glu Ile Ala Ser Phe Tyr Gln Phe Ile Trp Lys Leu Phe Phe
275 280 285

Leu Glu Tyr Leu Leu Tyr Pro Leu Ala Leu Trp Leu Pro Asn Phe Ser
290 295 300

Asn Thr Ile Phe
305



<210> 38 
<211> 1092 
<212> DNA 

<213> Synechocystis sp 



<400> 38 

atgaaatttc cgccccacag tggttaccat tggcaaggtc aatcaccttt ctttgaaggt 60
tggtacgtgc gcctgctttt gccccaatcc ggggaaagtt ttgcttttat gtactccatc 120
gaaaatcctg ctagcgatca tcattacggc ggcggtgctg tgcaaatttt agggccggct 180
acgaaaaaac aagaaaatca ggaagaccaa cttgtttggc ggacatttcc ctcggtaaaa 240
aaattttggg ccagtcctcg ccagtttgcc ctagggcatt ggggaaaatg tagggataac 300
aggcaggcga aacccctact ctccgaagaa ttttttgcca cggtcaagga aggttatcaa 360
atccatcaaa atcagcacca aggacaaatc attcatggcg atcgccattg tcgttggcag 420
ttcaccgtag aaccggaagt aacttggggg agtcctaacc gatttcctcg ggctacagcg 480
ggttggcttt cctttttacc cttgtttgat cccggttggc aaattctttt agcccaaggt 540
agagcgcacg gctggctgaa atggcagagg gaacagtatg aatttgacca cgccctagtt 600
tatgccgaaa aaaattgggg tcactccttt ccctcccgct ggttttggct ccaagcaaat 660
tattttcctg accatccagg actgagcgtc actgccgctg gcggggaacg gattgttctt 720
ggtcgccccg aagaggtagc tttaattggc ttacatcacc aaggtaattt ttacgaattt 780
ggcccgggcc atggcacagt cacttggcaa gtagctccct ggggccgttg gcaattaaaa 840
gccagcaatg ataggtattg ggtcaagttg tccggaaaaa cagataaaaa aggcagttta 900
gtccacactc ccaccgccca gggcttacaa ctcaactgcc gagataccac taggggctat 960
ttgtatttgc aattgggatc tgtgggtcac ggcctgatag tgcaagggga aacggacacc 1020
gcggggctag aagttggagg tgattggggt ttaacagagg aaaatttgag caaaaaaaca 1080
gtgccattct ga 1092

<210> 39
<211> 363
<212> PRT

<213> Synechocystis sp
<400> 39

Met Lys Phe Pro Pro His Ser Gly Tyr His Trp Gln Gly Gln Ser Pro
1 5 10 15

Phe Phe Glu Gly Trp Tyr Val Arg Leu Leu Leu Pro Gln Ser Gly Glu
20 25 30

Ser Phe Ala Phe Met Tyr Ser Ile Glu Asn Pro Ala Ser Asp His His
35 40 45

Tyr Gly Gly Gly Ala Val Gln Ile Leu Gly Pro Ala Thr Lys Lys Gln
50 55 60

Glu Asn Gln Glu Asp Gln Leu Val Trp Arg Thr Phe Pro Ser Val Lys
65 70 75 80

Lys Phe Trp Ala Ser Pro Arg Gln Phe Ala Leu Gly His Trp Gly Lys
85 90 95

Cys Arg Asp Asn Arg Gln Ala Lys Pro Leu Leu Ser Glu Glu Phe Phe
100 105 110

Ala Thr Val Lys Glu Gly Tyr Gln Ile His Gln Asn Gln His Gln Gly
115 120 125

Gln Ile Ile His Gly Asp Arg His Cys Arg Trp Gln Phe Thr Val Glu
130 135 140

Pro Glu Val Thr Trp Gly Ser Pro Asn Arg Phe Pro Arg Ala Thr Ala
145 150 155 160

Gly Trp Leu Ser Phe Leu Pro Leu Phe Asp Pro Gly Trp Gln Ile Leu
165 170 175

Leu Ala Gln Gly Arg Ala His Gly Trp Leu Lys Trp Gln Arg Glu Gln
180 185 190

Tyr Glu Phe Asp His Ala Leu Val Tyr Ala Glu Lys Asn Trp Gly His
195 200 205

Ser Phe Pro Ser Arg Trp Phe Trp Leu Gln Ala Asn Tyr Phe Pro Asp
210 215 220

His Pro Gly Leu Ser Val Thr Ala Ala Gly Gly Glu Arg Ile Val Leu
225 230 235 240

Gly Arg Pro Glu Glu Val Ala Leu Ile Gly Leu His His Gln Gly Asn
245 250 255

Phe Tyr Glu Phe Gly Pro Gly His Gly Thr Val Thr Trp Gln Val Ala
260 265 270

Pro Trp Gly Arg Trp Gln Leu Lys Ala Ser Asn Asp Arg Tyr Trp Val
275 280 285

Lys Leu Ser Gly Lys Thr Asp Lys Lys Gly Ser Leu Val His Thr Pro
290 295 300

Thr Ala Gln Gly Leu Gln Leu Asn Cys Arg Asp Thr Thr Arg Gly Tyr
305 310 315 320

Leu Tyr Leu Gln Leu Gly Ser Val Gly His Gly Leu Ile Val Gln Gly
325 330 335

Glu Thr Asp Thr Ala Gly Leu Glu Val Gly Gly Asp Trp Gly Leu Thr
340 345 350

Glu Glu Asn Leu Ser Lys Lys Thr Val Pro Phe
355 360



<210> 40 
<211> 56
<212> DNA

<213> Artificial Sequence
<400> 40

cgcgatttaa atggcgcgcc ctgcaggcgg ccgcctgcag ggcgcgccat ttaaat 56

<210> 41
<211> 32
<212> DNA
<213> Artificial Sequence

<400> 41

tcgaggatcc gcggccgcaa gcttcctgca gg 32

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<210> 42 
<211> 32 
<212> DNA 

<213> Artificial Sequence 
<400> 42 

tcgacctgca ggaagcttgc ggccgcggat cc 

<210> 43 
<211> 32 
<212> DNA 

<213> Artificial Sequence 
<400> 43 

tcgacctgca ggaagcttgc ggccgcggat cc 

<210> 44 
<211> 32 
<212> DNA 

<213> Artificial Sequence 
<400> 44 

tcgaggatcc gcggccgcaa gcttcctgca gg 

<210> 45 
<211> 36 
<212> DNA 

<213> Artificial Sequence 
<400> 45 

tcgaggatcc gcggccgcaa gcttcctgca ggagct 

<210> 46 
<211> 28 
<212> DNA 

<213> Artificial Sequence 



<400> 46 

cctgcaggaa gcttgcggcc gcggatcc 28

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<210> 47 
<211> 36 
<212> DNA 
<213> Artificial Sequence

<400> 47 

tcgacctgca ggaagcttgc ggccgcggat ccagct 

<210> 48
<211> 28 
<212> DNA 

<213> Artificial Sequence 
<400> 48 

ggatccgcgg ccgcaagctt cctgcagg 

<210> 49 
<211> 39 
<212> DNA 

<213> Artificial Sequence 
<400> 49 

gatcacctgc aggaagcttg cggccgcgga tccaatgca 

<210> 50 
<211> 31 
<212> DNA 

<213> Artificial Sequence 
<400> 50 

ttggatccgc ggccgcaagc ttcctgcagg t 

<210> 51 
<211> 41 
<212> DNA 

<213> Artificial Sequence 
<400> 51 



ggatccgcgg ccgcacaatg gagtctctgc tctctagttc t

<210> 52 
<211> 38 
<212> DNA 

<213> Artificial Sequence 
<400> 52 

ggatcctgca ggtcacttca aaaaaggtaa cagcaagt 

<210> 53 
<211> 45 
<212> DNA 

<213> Artificial Sequence 
<400> 53 

ggatccgcgg ccgcacaatg gcgttttttg ggctctcccg tgttt 

<210> 54 
<211> 40 
<212> DNA 

<213> Artificial Sequence 
<400> 54 

ggatcctgca ggttattgaa aacttcttcc aagtacaact 

<210> 55 
<211> 38 
<212> DNA 

<213> Artificial Sequence 
<400> 55 

ggatccgcgg ccgcacaatg tggcgaagat ctgttgtt 

<210> 56 
<211> 37 
<212> DNA 

<213> Artificial Sequence 



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<400> 56 

ggatcctgca ggtcatggag agtagaagga aggagct 

<210> 57 
<211> 50 
<212> DNA 

<213> Artificial Sequence 
<400> 57 

ggatccgcgg ccgcacaatg gtacttgccg aggttccaaa gcttgcctct 

<210> 58 
<211> 38 
<212> DNA 

<213> Artificial Sequence 
<400> 58

ggatcctgca ggtcacttgt ttctggtgat gactctat 

<210> 59 
<211> 38 
<212> DNA 

<213> Artificial Sequence 
<400> 59 

ggatccgcgg ccgcacaatg acttcgattc tcaacact 

<210> 60 
<211> 36 
<212> DNA 

<213> Artificial Sequence 
<400> 60 

ggatcctgca ggtcagtgtt gcgatgctaa tgccgt 

<210> 61 
<211> 22 
<212> DNA 

<213> Artificial Sequence 



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<400> 61 

taatgtgtac attgtcggcc tc 22

<210> 62
<211> 60
<212> DNA

<213> Artificial Sequence
<400> 62

gcaatgtaac atcagagatt ttgagacaca acgtggcttt ccacaattcc ccgcaccgtc 60

<210> 63
<211> 22
<212> DNA

<213> Artificial Sequence

<400> 63

aggctaataa gcacaaatgg ga 22



<210> 64 
<211> 63 
<212> DNA 

<213> Artificial Sequence 
<400> 64 



ggtatgagtc agcaacacct tcttcacgag gcagacctca gcggaattgg tttaggttat 60
ccc 63

<210> 65
<211> 26
<212> DNA

<213> Artificial Sequence

<400> 65

ggatccatgg ttgcccaaac cccatc 26

<210> 66
<211> 61
<212> DNA

<213> Artificial Sequence
<400> 66

gcaatgtaac atcagagatt ttgagacaca acgtggcttt gggtaagcaa caatgaccgg
c

<210> 67
<211> 25
<212> DNA

<213> Artificial Sequence
<400> 67

gaattctcaa agccagccca gtaac

<210> 68
<211> 63
<212> DNA

<213> Artificial Sequence
<400> 68

ggtatgagtc agcaacacct tcttcacgag gcagacctca gcgggtgcga aaagggtttt
ccc

<210> 69
<211> 23
<212> DNA

<213> Artificial Sequence
<400> 69

ccagtggttt aggctgtgtg gtc

<210> 70
<211> 21
<212> DNA

<213> Artificial Sequence

<400> 70

ctgagttgga tgtattggat c
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<210> 71 
<211> 28 
<212> DNA 
<213> Artificial Sequence

<400> 71

ggatccatgg ttacttcgac aaaaatcc 28

<210> 72
<211> 60
<212> DNA

<213> Artificial Sequence
<400> 72

gcaatgtaac atcagagatt ttgagacaca acgtggcttt gctaggcaac cgcttagtac 60

<210> 73
<211> 28
<212> DNA

<213> Artificial Sequence

<400> 73

gaattcttaa cccaacagta aagttccc 28

<210> 74
<211> 63
<212> DNA

<213> Artificial Sequence
<400> 74

ggtatgagtc agcaacacct tcttcacgag gcagacctca gcgccggcat tgtcttttac 60
atg 63


<210> 75
<211> 20
<212> DNA

<213> Artificial Sequence

<400> 75

ggaacccttg cagccgcttc 20

<210> 76
<211> 22
<212> DNA

<213> Artificial Sequence

<400> 76

gtatgcccaa ctggtgcaga gg 22

<210> 77
<211> 28
<212> DNA
<213> Artificial Sequence

<400> 77

ggatccatgt ctgacacaca aaataccg 28

<210> 78
<211> 62
<212> DNA

<213> Artificial Sequence
<400> 78

gcaatgtaac atcagagatt ttgagacaca acgtggcttt cgccaatacc agccaccaac 60
ag 62

<210> 79
<211> 27
<212> DNA

<213> Artificial Sequence
<400> 79

gaattctcaa atccccgcat ggcctag 27

<210> 80
<211> 65
<212> DNA

<213> Artificial Sequence
<400> 80

ggtatgagtc agcaacacct tcttcacgag gcagacctca gcggcctacg gcttggacgt 60
gtggg 65

<210> 81
<211> 21
<212> DNA

<213> Artificial Sequence
<400> 81

cacttggatt cccctgatct g 21

<210> 82 
<211> 21 
<212> DNA 

<213> Artificial Sequence 
<400> 82 

gcaatacccg cttggaaaac g 21

<210> 83
<211> 29
<212> DNA

<213> Artificial Sequence
<400> 83

ggatccatga ccgaatcttc gcccctagc 29

<210> 84
<211> 61
<212> DNA

<213> Artificial Sequence
<400> 84

gcaatgtaac atcagagatt ttgagacaca acgtggcttt caatcctagg tagccgaggc 60
g 61

<210> 85
<211> 27
<212> DNA

<213> Artificial Sequence

<400> 85

gaattcttag cccaggccag cccagcc 27

<210> 86
<211> 66
<212> DNA

<213> Artificial Sequence
<400> 86

ggtatgagtc agcaacacct tcttcacgag gcagacctca gcggggaatt gatttgttta 60
attacc 66

<210> 87
<211> 21
<212> DNA

<213> Artificial Sequence

<400> 87

gcgatcgcca ttatcgcttg g 21



<210> 88 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<400> 88 

gcagactggc aattatcagt aacg 24

<210> 89
<211> 25
<212> DNA

<213> Artificial Sequence
<400> 89

ccatggattc gagtaaagtt gtcgc 25

<210> 90
<211> 25
<212> DNA

<213> Artificial Sequence
<400> 90

gaattcactt caaaaaaggt aacag

<210> 91 
<211> 4550 
<212> DNA 

<213> Arabidopsis sp 



<400> 91

attttacacc aatttgatca cttaactaaa ttaattaaat tagatgatta tcccaccata 60
tttttgagca ttaaaccata aaaccatagt tataagtaac tgttttaatc gaatatgact 120
cgattaagat taggaaaaat ttataaccgg taattaagaa aacattaacc gtagtaaccg 180
taaatgccga ttcctccctt gtctaaaaga cagaaaacat atattttatt ttgccccata 240
tgtttcactc tatttaattt caggcacaat acttttggtt ggtaacaaaa ctaaaaagga 300
caacacgtga tacttttcct cgtccgtcag tcagattttt tttaaactag aaacaagtgg 360
caaatctaca ccacattttt tgcttaatct attaacttgt aagttttaaa ttcctaaaaa 420
agtctaacta attcttctaa tataagtaca ttccctaaat ttcccaaaaa gtcaaattaa 480
taattttcaa aatctaatct aaatatctaa taattcaaaa tcattaaaaa gacacgcaac 540
aatgacacca attaatcatc ctcgacccac acaattctac agttctcatg ctaaaccata 600
ttttttgctc tctgttcctt caaaatcatt tctttctctt ctttgattcc caaagatcac 660
ttctttgtct ttgatttttg attttttttc tctctggcgt gaaggaagaa gctttatttc 720
atggagtctc tgctctctag ttcttctctt gtttccgctg gtaaatctcg tccttttctg 780
gtttcaggtt ttatttgttg tttaggtttc gtttttgtga ttcagaacca tacaaaaagt 840
ttgaactttt ctgaatataa aataaggaaa aagtttcgat ttttataatg aattgtttac 900
tagatcgaag taggtgacaa aggttattgt gtggagaagc ataatttctg ggcttgactt 960
tgaattttgt ttctcatgca tgcaacttat caatcagctg gtgggttttg ttggaagaag 1020
cagaatctaa agctccactc tttatcaggt tcgttagggt tttatgggtt tttgaaatta 1080
aatactcaat catcttagtc tcattattct attggttgaa tcacattttc taatttggaa 1140
tttatgagac aatgtatgtt ggacttagtt gaagttcttc tctttggtta tagttgaagt 1200
gttactgatg ttgtttagct ctttacacca atatatacac ccaattttgc agaaatccga 1260
gttctgcgtt gtgattcgag taaagttgtc gcaaaaccga agtttaggaa caatcttgtt 1320
aggcctgatg gtcaaggatc ttcattgttg ttgtatccaa aacataagtc gagatttcgg 1380
gttaatgcca ctgcgggtca gcctgaggct ttcgactcga atagcaaaca gaagtctttt 1440
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10 



15 



20 



25 



30 



35 



agagactcgt tagatgcgtt ttacaggttt tctaggcctc atacagttat tggcacagtt 
aagtttctct ttaaaaatgt aactctttta aaacgcaatc tttcagggtt ttcaaggaga 
taacattagc tctgtgattg gatttgcagg tgcttagcat tttatctgta tctttcttag 
cagtagagaa ggtttctgat atatctcctt tacttttcac tggcatcttg gaggtaatga 
atatataaca cataatgacc gatgaagaag atacattttt ttcgtctctc tgtttaaaca 
attgggtttt gttttcaggc tgttgttgca gctctcatga tgaacattta catagttggg 
ctaaatcagt tgtctgatgt tgaaatagat aaggtaacat gcaaattttc ttcatatgag 
ttcgagagac tgatgagatt aatagcagct agtgcctaga tcatctctat gtgggttttt 
gcaggttaac aagccctatc ttccattggc atcaggagaa tattctgtta acaccggcat 
tgcaatagta gcttccttct ccatcatggt atggtgccat tttcacaaaa tttcaacttt 
tagaattcta taagttactg aaatagtttg ttataaatcg ttatagagtt tctggcttgg 
gtggattgtt ggttcatggc cattgttctg ggctcttttt gtgagtttca tgctcggtac
tgcatactct atcaatgtaa gtaagtttct caatactaga atttggctca aatcaaaatc 
tgcagtttct agttttaggt taatgaggtt ttaataactt acttctacta caaacagttg 
ccacttttac ggtggaaaag atttgcattg gttgcagcaa tgtgtatcct cgctgtccga 
gctattattg ttcaaatcgc cttttatcta catattcagg tactaaacca ttttccttat 
gttttgtagt tgttttcatc aaaatcactt ttatattact aaagctgtga aactttgttg 
cagacacatg tgtttggaag accaatcttg ttcactaggc ctcttatttt cgccactgcg 
tttatgagct ttttctctgt cgttattgca ttgtttaagg taaacaaaga tggaaaaaga 
ttaaatctat gtatacttaa agtaaagcat tctactgtta ttgatgagaa gttttctttt 
ttggttggat gcaggatata cctgatatcg aaggggataa gatattcgga atccgatcat 
tctctgtaac tctgggtcag aaacgggtac gatatctaaa ctaaagaaat tgttttgact 
caagtgttgg attaagatta cagaagaaag aaaactgttt ttgtttcttg caaaattcag 
gtgttttgga catgtgttac actacttcaa atggcttacg ctgttgcaat tctagttgga 
gccacatctc cattcatatg gagcaaagtc atctcggtaa caatctttct ttacccatcg 
aaaactcgct aattcatcgt ttgagtggta ctggtttcat tttgttccgt tctgttgatt 
ttttttcagg ttgtgggtca tgttatactc gcaacaactt tgtgggctcg agctaagtcc 
gttgatctga gtagcaaaac cgaaataact tcatgttata tgttcatatg gaaggttaga 
ttcgtttata aatagagtct ttactgcctt tttatgcgct ccaatttgga attaaaatag 
cctttcagtt tcatcgaatc accattatac tgataaattc tcatttctgc atcagctctt 
ttatgcagag tacttgctgt tacctttttt gaagtgactg acattagaag agaagaagat 
ggagataaaa gaataagtca tcactatgct tctgttttta ttacaagttc atgaaattag 
gtagtgaact agtgaattag agttttattc tgaaacatgg cagactgcaa aaatatgtca 
aagatatgaa tttctgttgg gtaaagaagt ctctgcttgg gcaaaatctt aaggttcggt 
gtgttgatat aatgctaagc gaagaaatcg attctatgta gaaatttccg aaactatgtg 
taaacatgtc agaacatctc cattctatat cttcttctgc aagaaagctc tgtttttatc 
acctaaactc tttatctctg tgtagttaag atatgtatat gtacgtgact acattttttt 
gttgatgtaa tttgcagaac gtatggattt ttgttagaaa gcatgagttc gaaagtatat 
gtttatatat atggataatt cagacctaac gtcgaagctc acaagcataa attcactact 
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1620 

1680 

1740 
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1920 
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5 



atagtttgct ctgtaataga tagttccatt gatgtcttga aactgtacgt aactgcctgg 3840
gcgttttgtg gttgatactg actactgagt gttctttgtg agtgttgtaa gtatacaaga 3900
agaagaatat aggctcacgg gaacgactgt Sgtggaagat gaaatggaga tcatcacgta 3960
gcggctttgc caaagaccga gtcacgatcg agtctatgaa gtctttacag ctgctgatta 4020
tgattgacca ttgcttagag acgcattgga atcttactag ggacttgcct gggagtttct 4080
tcaagtacgt gtcagatcat acgatgtagg agatttcacg gctttgatgt gtttgtttgg 4140
agtcacaatg cttaatgggc ttattggccc aataatagct agctcttttg ctttagccgt 4200
ttcgtttgtc ccctggtggt gagtattatt agggtatggt gtgaccaaag tcaccagacc 4260
tagagtgaat ctagtagagt cctagaccat ggtccatggc ttttatttgt aatttgaaaa 4320
atgaacaatt ctttttgtaa ggaaaacttt tatatagtag acgtttacta tatagaaact 4380
agttgaacta acttcgtgca attgcataat aatggtgtga aatagagggt gcaaaactca 4440
ataaacattt cgacgtacca agagttcgaa acaataagca aaatagattt ttttgcttca 4500
gactaatttg tacaatgaat ggttaataaa ccattgaagc ttttattaat 4550

<210> 92
<211> 4450
<212> DNA

<213> Arabidopsis sp

<400> 92
tttaggttac aaaatcaatg atattgcgta tgtcaactat aaaagccaaa agtaaagcct 60
cttgtttgac cagaaggtca tgatcattgt atacatacag ccaaactacc tcctggaaga 120
aaagacatgg atcccaaaca acaacaatag cttcttttac aagaaccagt agtaactagt 180
cactaatcta aaagagttaa gtttcagctt ttctggcaat ggctccttga tcatttcaat 240
cctgaaggag acccactttg tagcaagacc atgtcctctg tttcacttac agtgtgtctc 300
aaaagtctac ttcaattctt catatatagg ttcctcacac tacagcttca tcctcattcg 360
ttgacagaga gagagtcttt attgaaaact tcttccaagt acaactccac taaatataat 420
agcaccaaac cacttgttcg acacaaatct gtacagatat aaaaacacta ttaggttttc 480
caaggcaaat cacataattg gattgtgaaa gagtacaaaa gataaaccca aattttcata 540
ctttctactg cagtcagcac cagatgataa gtcagctgtc cctatttgcc atcctaactg 600
tcctgatgca gcggccagtg atgcgtaata ttgccaccct taatcattag agcgagaaac 660
aaaaagaatc aaaagacagt aaatggaatt aggaatcaca aatgagtcct tgtaaagttt 720
attgagtacc gagatctgca ctgaatccag aaagtgcaag aaaacctatg gatgctgtgc 780
caaatccagt taaccaaagc tttgtattat caccgaatct aagggctgtt gacttaacac 840
caacttttac atcatcttct ttgtcctgga gacacaatat attagacatt agtccatgga 900
aaaaaaatga tttaacctag aatatctcaa aattacttgc ataaaaactg aacttgagct 960
gaaattttgg gttcgtagct tgtggcatat actatttcat fcttcaatggg ccacaaaggt 1020
aactttcttt tctcacttct gttgcaaacg ggaagacttt tatggggcta actcttcact 1080
taaagtatag aaatcagatg gaaaaggtgg gagatcaggg taattttctt ctttatgatt 1140
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gacaaaagtc 
agaaatctgt 
gattaactac 
gaatatacct 
5 tagagaggga 
gctccccagt 
atcataagaa 
ttacccaaaa 
cccctaaaac 
1 0 atcacaaaac 
gggtgaacca 
gcaagtaaaa 
aaagcaactg 
taggtcttag 
1 5 taaaacaaaa 
tggaagaatg 
ttcaatcgaa 
gaacgaagtc 
tgttgtcgta 
20 aacactttcc 
catactaaag 
aaacaaactg 
acctctaaga 
tccaggatca 
25 atctatttac 
ctaaagactt 
tatataaaat 
cttaaccact 
atgagctctt 
3 0 agactgcaag 
cagacataaa 
accaaccagg 
acataaacca 
agctattctt 
35 acattactca 
caatgggttt 
aatctatcca 
tcggctttcc 
aaccaacacc 



gaacatcgaa 
Srgtggtgaag 
tttgctactg 
gatgtgcata 
gtacaataga 
ttatggtcaa 
aatcagaaaa 
tgtaaacctc 
acggctgcag 
taaagacaag 
tatgtgtatg 
aatccaaaca 
cagcccgaga 
ttttgtacga 
caaccataca 
aatccagtta 
aaacatattc 
atcagaacat 
aattgatcca 
caccatggtt 
ggatatataa 
acctttgtat 
agtaatgctc 
gcagccaacg 
atagctctgg 
ccaaacagat 
caaagaaaac 
ctcccatgct 
gggaagatca 
aactactcca 
ttcttttatc 
aaaacacata 
tcctttggga 
tcggatggat 
aaggcgaaga 
atccaatcga 
agaagcttcc 
ctccaaaacc 
aaaaaacttc 



atggatgcat 
ctagaaaaag 
gtcataatca 
aatagtatca 
tggtgctatg 
acctaaaaag 
tatataatgt 
ttcataagtg 
aatatacata 
acctgagaac 
tgaattttta 
aacctgtaat 
aatccaatcc 
tcaacctgga 
aaatcttgag 
catgaatgct 
caccttcacc 
gcagataagc 
acatagaaaa 
acagaaacca 
atttgacatc 
ctatgtcctg 
cgcaaccaaa 
caatcgacct 
aactagatcc 
tcctgagtaa 
tcaggtttat 
atcaaaaacc 
ttatggattt 
aacttctcca 
aagcttcaag 
actttatcac 
cgaaaggaaa 
tataatgaat 
taaacttacc 
gcaagcttag 
ttaacaacaa 
gaagaagacg 
tcctgatgca 



ttgcatgaga 
aaaacaaagc 
aatagatttt 
taaacaaggg 
cttcctttaa 
gcttgaggct 
ctaactttga 
ggtaggaaaa 
ctgaaatgag 
atatcttcag 
aacaaacact 
tgttaagttg 
cttgaaatgg 
tataaaagaa 
ctttacatac 
gtgtatctac 
atatctaaca 
tattacccaa 
atcaagacca 
tagttacaca 
actttatcac 
atcaagcaga 
taaagccata 
atacaacaat 
atgacgaaac 
gaaacccagt 
agcattatcc 
tcagctcaag 
gataactgaa 
ctgatatgta 
agcaagttag 
ataaaactaa 
ctatataaac 
ctcaaaagtg 
acatacaagg 
cataacctct 
caccatcact 
acgacattcc 
attctcttcc 



catgaaacaa 
aagcaatatg 
gaagctaaaa 
tccagcagac 
ctgcagtcca 
gcaattataa 
gaagccagaa 
gacaagtaac 
ctcaagtaga 
aatttgggcc 
tgcaaatacg 
gagaagaatc 
tgtcaaaaga 
atttgtaaga 
aagcaaccca 
cctaactact 
cctgaagtct 
aacagagata 
gttccagatg 
aaacatgttt 
cataccataa 
tcatttatag 
tatttaaaac 
gatggagatt 
atggaacatc 
ggaactatag 
aatcctgatt 
atcatactac 
aaaagtaaca 
tgtagtctaa 
tcagaaaaca 
atttaatgta 
atgcagtctt 
aaatgtcttg 
ccacgcaagc 
aacttcttct 
cttctcctta 
acaaattaat 
tttactccat 



aagctgaaaa 
cacacattga 
aataaaaagt 
tccggagaga 
tcctaacaat 
aaacgaatca 
tagatttaaa 
aaagatgaag 
aaagaatttg 
aactacataa 
cgactttagg 
cctaagccta 
ccactggcga 
caacataatc 
tctttgttta 
aaacacatat 
ttcacttttt 
tgactggaaa 
tcaaagcaat 
cctaaaccaa 
gatagcttaa 
tacaaccagc 
ttggaaggct 
cagagtatcg 
gttataatat 
tactgtaaca 
tctgccaatc 
ctaattgcct 
gagaaatagc 
caataataaa 
tcacagccaa 
atctgactta 
tctttccctc 
attctcagct 
aaccaagttc 
ggtaaataca 
tcatctttct 
ctgtaattcc 
acttggtaat
tatcattcca tgaaggataa cacttagtga aaggatttgt gtaatgggta gtcacaggat 3540
tggacaagga tttatgttgt gattgcaaaa gagcagagga agaagatgga gttacggaga 3600
cggaagattt caacaaccgt cttgaaacac gggagagccc aaaaaacgcc atctttgaga 3660
gaaattgttg cctggaagaa acaaagactt gagatttcaa acgtaagtga attcttacga 3720
acgaaagcta acttctcaag agaatcagat tagtgattcc tcaaaaacaa acaaaactat 3780
ctaatttcag tttcgagtga tgaagcctta agaatctaga acctccatgg cgtttctaat 3840
ctctcagaga taatcgaatt ccttaaacaa tcaaagctta gaaagagaag aacaacaaca 3900
acaacaaaaa aaatcagatt aacaaccgac cagagagcaa cgacgacgcc ggcgagaaag 3960
agcacgtcgt ctcggagcaa gacttcttct ccagtaaccc ggatggatcg ttaatgggcc 4020
tgtagattat tatatttggg ccgaaacaat tgggtcagca aaaacttggg ggataatgaa 4080
gaaacacgta cagtatgcat ttaggctcca aattaattgg ccatataatt cgaatcagat 4140
aaactaatca acccctacct tacttatttc tcactgtttt tatttctacc ttagtagttg 4200
aagaaacact tttatttatc ttttcgggac ccaaatttga taggatcggg ccattactca 4260
tgagcgtcag acacatatta gccttatcag attagtgggg taaggttttt ttaattcggt 4320
aagaagcaac aatcaatgtc ggagaaatta aagaatctgc atgggcgtgg cgtgatgata 4380
tgtgcatatg gagtcagttg ccgatcatat ataactattt ataaactaca tataaagact 4440
actaatagat 4450

<210> 93
<211> 2850
<212> DNA

<213> Arabidopsis sp

<400> 93

aattaaaatt tgagcggtct aaaccattag accgtttaga gatccctcca acccaaaata 60
gtcgattttc acgtcttgaa catatattgg gccttaatct gtgtggttag taaagacttt 120
tattggtcaa agaaaaacaa ccatggccca acatgttgat acttttattt aattatacaa 180
gtacccctga attctctgaa atatatttga ttgacccaga tattaatttt aattatcatt 240
tcctgtaaaa gtgaaggagt caccgtgact cgtcgtaatc tgaaaccaat ctgttcatat 300
gatgaagaag tttctctcgt tctcctccaa cgcgtagaaa attctgacgg cttaacgatg 360
tggcgaagat ctgttgttta tcgtttctct tcaagaatct ctgtttcttc ttcgttacca 420
aaccctagac tgattccttg gtcccgcgaa ttatgtgccg ttaatagctt ctcccagcct 480
ccggtctcga cggaatcaac tgctaagtta gggatcactg gtgttagatc tgatgccaat 540
cgagtttttg ccactgctac tgccgccgct acagctacag ctaccaccgg tgagatttcg 600
tctagagttg cggctttggc tggattaggg catcactacg ctcgttgtta ttgggagctt 660
tctaaagcta aacttaggta tgtgtttact tttcttttct catgaaaaat ctgaaaattt 720
ccaattgttg gattcttaaa ttctcatttg ttttatggtt gtagtatgct tgtggttgca 780
acttctggaa ctgggtatat tctgggtacg ggaaatgctg caattagctt cccggggctt 840
tgttacacat gtgcaggaac catgatgatt gctgcatctg ctaattcctt gaatcaggtc 900
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30 



attgaaatgt 
gcttgcttat 
cgcttgtttg 
gaacgatgct 
ctactattgc 
ttatatgtga 
gcttccacga 
cttgtacttt 
gttggcgctg 
ctttatttta 
ttcaccaatt 
gatttcatac 
tatggccctt 
gtcatatgag 
gtggaatgat 
atgagttctt 
acttctgatt 
ccgtcaggga 
ggtttcatcg 
tgcattgctg 
caagttggtt 
cattctaccg 
tccttcctgt 
aactcgtaga 
gaaagaaacg 
tcccagctcc 
aacagaaatt 
gtggagaacg 
ctaagtatgt 
aatttttgag 
gaaatgaaat 
aggtctcgag 
cacgaagatg 



tgagaagttc 
gtttatgtag 
ttttttcatt 
aaggccattg 
tggtgcttct 
tttctttgtt 
caaggttatt 
atgcgtttgt 
ttgttggtgc 
gcagattctg 
ctatgcttat 
aattcgatga 
gcacatctct 
attagaatgt 
cagagtgtcc 
tccgttagag 
ttgtttcttg 
agagaatagc 
cctatgactg
tatctgattt 
ttgcctcgaa 
agaccggacc 
tttcatgtct 
agaagccgga 
tgtggctcaa 
ttccttctac 
aaaaaaaaaa 
catacaagtt 
ttcaaatgat 
ctttgacgtg 
ccgataaccg 
tctcgacggc 
gcgatgaggt 



ataaatttcg 
ttgaaaagtt 
ttctagattt 
ccttcaggac 
ggtgcttgtt 
ttatgaatgg 
gcagactaat 
ttatactccg 
tatcccaccc 
ttttgttgga 
ctattttgtg 
ttcttccagc 
gccgcaatga 
ctccttccat 
tagatagtgt 
ataaacattc 
gtaccttgtt 
agcagtggct 
tgagtcttgt 
ttgctgttcc 
tcaacacttc 
atgcataaag 

ggtcttcttc 
ttaacaaatt 
cctccggtgg 
tctccatgat 
tctgaaaagt 
tatgtatttt 
acaaaataca 
ttaggtctat 
atgatggtgt 
tgcggaaatc 
tgaaatcaat 



aatccttgtt 
taaaaatttc 
ttgagataag 
gtattagtgt 
tgttggccag 
gtgattgaga 
atgttggctg 
ttgaagcaac 
ttgcttgggt 
tactgctttt 
tgttgtcagg 
tgctctttac 
ttatgcagct 
gtagtgttga 
cacagcagtc 
gcgaacattg 
ttcagttaca 
ctaaggaact 
agattcatct 
ttccaatttt 
tcacactagc 
caaggaaaat 
tacaccgtgt 
ctgtatctgg 
cttatgcctc 
aacctttaag 
tcttaagttt 
ttctcatctc 
tactttatca 
ctaataaacg 
agagttaaac 
ggaaaatcac 



gtgtttatgt 
taatccttgg 
caatgattct 
tccacacgct 
caaggtgaat 
gattatggat 
ctggacttgc 
ttcaccctat 
aaatttttgt 
aattcaaaat 
tgggcggcag 
ttttggcaga 
ggagggtaag 
tcttgaacta 
gacattttag 
tttccagctt 
agatgttgtc 
gcttttacat 
tttttttgta 
tgtgacaggg 
aatcgctgca 
gttccatgcc 
ctctaatgat 
tgaagtcaaa 
tgctgcaccg 
caagctattg 
aatctttggt 
cacataattg 
attatctgat 
tagtaacgaa 
gattaaaccg 
gattgaggac 



agttgatctt 
tagttgatct 
aagatgaaaa 
gttgcatggg 
gtttgttttt 
ctaaactttt 
atctgccaat 
caatacatgg 
tccttttctt 
gtagtcatgg 
cgtctggtca 
tacctcattt 
accatatggt 
gttcaatttc 
tggctagata 
ccgcgaccca 
actctttgat 
gatccctctc 
gtttattgac 

gggttaacct 

acagcatttt 
agtcttctct 
aatcagcaac 
actcagaggc 
tttcctttcc 
aatttttgga 
taataatgaa 
tattttttct 
caaattgatg 
tttggttttg 
ggttggttaa 
tttgagctgc
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<210> 94 
<211> 3660 
<212> DNA 

<213> Arabidopsis sp 
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<400> 94 
tatttgtatt tttattgtta aattttatga tttcacccgg tatatatcat cccatattaa 60
tattagattt attttttggg ctttatttgg gttttcgatt taaactgggc ccattctgct 120
tcaatgaaac cctaatgggt tttgtttggg ctttggattt aaaccgggcc cattctgctt 180
caatgaaggt cctttgtcca acaaaactaa catccgacac aactagtatt gccaagagga 240
tcgtgccaca tggcagttat tgaatcaaag gccgccaaaa ctgtaacgta gacattactt 300
atctccggta acggacaacc actcgtttcc cgaaacagca actcacagac tcacaccact 360
ccagtctccg gcttaactac caccagagac gattctctct tccgtcggtt ctatgacttc 420
gattctcaac actgtctcca ccatccactc ttccagagtt acctccgtcg atcgagtcgg 480
agtcctctct cttcggaatt cggattccgt tgagttcact cgccggcgtt ctggtttctc 540
gacgttgatc tacgaatcac ccggtagtta gcattctgtt ggatagattg atgaatgttt 600
tcttcgattt tttttttact gatcttgttg tggatctctc gtagggcgga gatttgttgt 660
gcgtgcggcg gagactgata ctgataaagg tatgattttt tagttgtttt tattttctct 720
ctcttcaaaa ttctcttttc aaacactgtg gcgtttgaat ttccgacggc agttaaatct 780
cagacacctg acaaggcacc agccggtggt tcaagcatta accagcttct cggtatcaaa 840
ggagcatctc aagaaactgt aattttgttc atctcctcag aatcttttaa attatcatat 900
ttgtggataa tgatgtgtta gtttaggaat tttcctacta aaggtaatct cttttgagga 960
caagtcttgt ttttagctta gaaatgatgt gaaaatgttg tttgttagct aaaaagagtt 1020
tgttgttata ttctgtattc agaataaatg gaagattcgt cttcagctta caaaaccagt 1080
cacttggcct ccactggttt ggggagtcgt ctgtggtgct gctgcttcag gtaatcatac 1140
gaacctcttt tggatcatgc aatactgtac agaaagtttt ttcattttcc ttccaattgt 1200
ttcttctggc agggaacttt cattggaccc cagaggatgt tgctaagtcg attctttgca 1260
tgatgatgtc tggtccttgt cttactggct atacacaggt ctggttttac acaacaaaaa 1320
gctgacttgt tcttattcta gtgcatttgc ttggtgctac aataacctag acttgtcgat 1380
ttccagacaa tcaacgactg gtatgataga gatatcgacg caattaatga gccatatcgt 1440
ccaattccat ctggagcaat atcagagcca gaggtaactg agacagaaca ttgtgagctt 1500
ttatctcttt tgtgattctg atttctcctt actccttaaa atgcaggtta ttacacaagt 1560
ctgggtgcta ttattgggag gtcttggtat tgctggaata ttagatgtgt gggtaagttg 1620
gcccttctga cattaactag tacagttaaa gggcacatca gatttgctaa aatcttccct 1680
tatcaggcag ggcataccac tcccactgtc ttctatcttg ctttgggagg atcattgcta 1740
tcttatatat actctgctcc acctcttaag gtaagtttta ttcctaactt ccactctcta 1800
gtgataagac actccatcca agttttggag ttttgaatat cgatatctga actgatctca 1860
ttgcagctaa aacaaaatgg atgggttgga aattttgcac ttggagcaag ctatattagt 1920
ttgccatggt aagatatctc gtgtatcaat aatatatggc gttgttctca tctcattgat 1980
ttgtttcttg ctcacttgac tgataggtgg gctggccaag cattgtttgg cactcttacg 2040
ccagatgttg ttgttctaac actcttgtac agcatagctg gggtactctt ttggcaaacc 2100
ttttatgttg cttttttcgt tatctgttgt aatatgctct tgcttcatgt tgtacctttg 2160
tgataatgca gttaggaata gccattgtta acgacttcaa aagtgttgaa ggagatagag 2220
cattaggact tcagtctctc ccagtagctt ttggcaccga aactgcaaaa tggatatgcg 2280
ttggtgctat agacattact cagctttctg ttgccggtat gtactatcca ctgtttttgt 2340
gcagctgtgg cttctatttc ttttccttga tcttatcaac tggatattca ccaatggtaa 2400
agcacaaatt aatgaagctg aatcaacaaa ggcaaaacat aaaagtacat tctaatgaaa 2460
tgagctaatg aagaggaggc atctactttt atgtttcatt agtgtgattg atggattttc 2520
atttcatgct tctaaaacaa gtattttcaa cagtgtcatg aaataacaga acttatatct 2580
tcatttgtac ttttactagt ggatgagtta cacaatcatt gttatagaac caaatcaaag 2640
gtagagatca tcattagtat atgtctattt tggttgcagg atatctatta gcatctggga 2700
aaccttatta tgcgttggcg ttggttgctt tgatcattcc tcagattgtg ttccaggtaa 2760
agacgttaac agtctcacat tataattaat caaattcttg tcactcgtct gattgctaca 2820
ctcgcttcta taaactgcag tttaaatact ttctcaagga ccctgtcaaa tacgacgtca 2880
agtaccaggt aagtcaactt agtacacatg tttgtgttct tttgaaatat ctttgagagg 2940
tctcttaatc agaagttgct tgaaacactc atcttgatta caggcaagcg cgcagccatt 3000
cttggtgctc ggaatatttg taacggcatt agcatcgcaa cactgaaaaa ggcgtatttt 3060
gatggggttt tgtcgaaagc agaggtgttg acacatcaaa tgtgggcaag tgatggcatc 3120
aactagttta aaagattttg taaaatgtat gtaccgttat tactagaaac aactcctgtt 3180
gtatcaattt agcaaaacgg ctgagaaatt gtaattgatg ttaccgtatt tgcgctccat 3240
ttttgcattt cctgctcata tcgaggattg gggtttatgt tagttctgtc acttctctgc 3300
tttcagaatg tttttgtttt ctgtagtgga ttttaactat tttcatcact ttttgtattg 3360
attctaaaca tgtatccaca taaaaacagt aatatacaaa aatgatactt cctcaaactt 3420
tttataatct aaatctaaca actagctagt aacccaacta acttcataca attaatttga 3480
gaaactacaa agactagact atacatatgt tatttaacaa cttgaaactg tgttattact 3540
acctgatttt tttctattct acagccattt gatatgctgc aatcttaaca tatcaagtct 3600
cacgttgttg gacacaacat actatcacaa gtaagacacg aagtaaaacc aaccggcaac 3660



WO 01/79472 PCT/US01/12334

<210> 95
<211> 1236
<212> DNA
<213> SOY

<400> 95
atggattcac tgcttcttcg atctttccct aatattaata acgcctcttc tctcaccacc 60
actggtgcaa atttctccag gactaaatct ttcgccaaca tttaccatgc aagttcttat 120
gtgccaaatg cttcatggca caataggaaa atccaaaaag aatataattt tttgaggttt 180
cggtggccaa gtttgaacca tcattacaaa ggcattgagg gagcgtgtac atgtaaaaaa 240
tgtaatataa aatttgttgt gaaagcgacc tctgaaaaat ctcttgagtc tgaacctcaa 300
gcttttgatc caaaaagcat tttggactct gtcaagaatt ccttggatgc tttctacagg 360
ttttccaggc ctcacacagt tattggcaca gcattaagca taatttctgt gtctcttctt 420
gctgttgaga aaatatcaga tatatctcca ttatttttta ctggtgtgtt ggaggctgtg 480
gttgctgccc tgtttatgaa tatttatatt gttggtttga atcaattgtc tgatgttgaa 540
atagacaaga taaacaagcc gtatcttcca ttagcatctg gggaatattc ctttgaaact 600
ggtgtcacta ttgttgcatc tttttcaatt ctgagttttt ggcttggctg ggttgtaggt 660
tcatggccat tattttgggc cctttttgta agctttgtgc taggaactgc ttattcaatc 720
aatgtgcctc tgttgagatg gaagaggttt gcagtgcttg cagcgatgtg cattctagct 780
gttcgggcag taatagttca acttgcattt ttccttcaca tgcagactca tgtgtacaag 840
aggccacctg tcttttcaag accattgatt tttgctactg cattcatgag cttcttctct 900
gtagttatag cactgtttaa ggatatacct gacattgaag gagataaagt atttggcatc 960
caatcttttt cagtgcgttt aggtcagaag ccggtgttct ggacttgtgt tacccttctt 1020
gaaatagctt atggagtcgc cctcctggtg ggagctgcat ctccttgtct ttggagcaaa 1080
attttcacgg gtctgggaca cgctgtgctg gcttcaattc tctggtttca tgccaaatct 1140
gtagatttga aaagcaaagc ttcgataaca tccttctata tgtttatttg gaagctattt 1200
tatgcagaat acttactcat tccttttgtt agatga 1236

<210> 96
<211> 1188
<212> DNA
<213> SOY

<400> 96
atggattcga tgcttcttcg atcttttcct aatattaaca acgcttcttc tctcgccacc 60
actggttctt atttgccaaa tgcttcatgg cacaatagga aaatccaaaa agaatataat 120
tttttgaggt ttcggtggcc aagtttgaac caccattaca aaagcattga aggagggtgt 180
acatgtaaaa aatgtaatat aaaatttgtt gtgaaagcga cctctgaaaa atcttttgag 240
tctgaacccc aagcttttga tccaaaaagc attttggact ctgtcaagaa ttccttggat 300
gctttctaca ggttttccag acctcacaca gttattggca cagcattaag cataatttct 360
gtgtccctcc ttgctgttga gaaaatatca gatatatctc cattattttt tactggtgtg 420
ttggaggctg tggttgctgc cctgtttatg aatatttata ttgttggttt gaatcaattg 480
tctgatgttg aaatagacaa gataaacaag ccgtatcttc cattagcatc tggggaatat 540
tcctttgaaa ctggtgtcac tattgttgca tctttttcaa ttctgagttt ttggcttggc 600
tgggttgtag gttcatggcc attattttgg gccctttttg taagctttgt gctaggaact 660
gcttattcaa tcaatgtgcc tctgttgaga tggaagaggt ttgcagtgct tgcagcgatg 720
tgcattctag ctgttcgggc agtaatagtt caacttgcat ttttccttca catccagact 780
catgtataca agaggccacc tgtcttttca agatcattga tttttgctac tgcattcatg 840
agcttcttct ctgtagttat agcactgttt aaggatatac ctgacattga aggagataaa 900
gtatttggca tccaatcttt ttcagtgcgt ttaggtcaga agccggtatt ctggacttgt 960
gttatccttc ttgaaatagc ttatggagtc gccctcctgg tgggagctgc atctccttgt 1020
ctttggagca aaattgtcac gggtctggga cacgctgttc tggcttcaat tctctggttt 1080
catgccaaat ctgtagattt gaaaagcaaa gcttcgataa catccttcta tatgtttatt 1140
tggaagctat tttatgcaga atacttactc attccttttg ttagatga 1188



<210> 97
<211> 395
<212> PRT
<213> SOY

<400> 97
Met Asp Ser Met Leu Leu Arg Ser Phe Pro Asn Ile Asn Asn Ala Ser
1 5 10 15
Ser Leu Ala Thr Thr Gly Ser Tyr Leu Pro Asn Ala Ser Trp His Asn
20 25 30
Arg Lys Ile Gln Lys Glu Tyr Asn Phe Leu Arg Phe Arg Trp Pro Ser
35 40 45
Leu Asn His His Tyr Lys Ser Ile Glu Gly Gly Cys Thr Cys Lys Lys
50 55 60
Cys Asn Ile Lys Phe Val Val Lys Ala Thr Ser Glu Lys Ser Phe Glu
65 70 75 80
Ser Glu Pro Gln Ala Phe Asp Pro Lys Ser Ile Leu Asp Ser Val Lys
85 90 95
Asn Ser Leu Asp Ala Phe Tyr Arg Phe Ser Arg Pro His Thr Val Ile
100 105 110
Gly Thr Ala Leu Ser Ile Ile Ser Val Ser Leu Leu Ala Val Glu Lys
115 120 125
Ile Ser Asp Ile Ser Pro Leu Phe Phe Thr Gly Val Leu Glu Ala Val
130 135 140
Val Ala Ala Leu Phe Met Asn Ile Tyr Ile Val Gly Leu Asn Gln Leu
145 150 155 160
Ser Asp Val Glu Ile Asp Lys Ile Asn Lys Pro Tyr Leu Pro Leu Ala
165 170 175
Ser Gly Glu Tyr Ser Phe Glu Thr Gly Val Thr Ile Val Ala Ser Phe
180 185 190
Ser Ile Leu Ser Phe Trp Leu Gly Trp Val Val Gly Ser Trp Pro Leu
195 200 205
Phe Trp Ala Leu Phe Val Ser Phe Val Leu Gly Thr Ala Tyr Ser Ile
210 215 220
Asn Val Pro Leu Leu Arg Trp Lys Arg Phe Ala Val Leu Ala Ala Met
225 230 235 240
Cys Ile Leu Ala Val Arg Ala Val Ile Val Gln Leu Ala Phe Phe Leu
245 250 255
His Ile Gln Thr His Val Tyr Lys Arg Pro Pro Val Phe Ser Arg Ser
260 265 270
Leu Ile Phe Ala Thr Ala Phe Met Ser Phe Phe Ser Val Val Ile Ala
275 280 285
Leu Phe Lys Asp Ile Pro Asp Ile Glu Gly Asp Lys Val Phe Gly Ile
290 295 300
Gln Ser Phe Ser Val Arg Leu Gly Gln Lys Pro Val Phe Trp Thr Cys
305 310 315 320
Val Ile Leu Leu Glu Ile Ala Tyr Gly Val Ala Leu Leu Val Gly Ala
325 330 335
Ala Ser Pro Cys Leu Trp Ser Lys Ile Val Thr Gly Leu Gly His Ala
340 345 350
Val Leu Ala Ser Ile Leu Trp Phe His Ala Lys Ser Val Asp Leu Lys
355 360 365
Ser Lys Ala Ser Ile Thr Ser Phe Tyr Met Phe Ile Trp Lys Leu Phe
370 375 380
Tyr Ala Glu Tyr Leu Leu Ile Pro Phe Val Arg
385 390 395



55 



<210> 98
<211> 411
<212> PRT
<213> SOY

<400> 98
Met Asp Ser Leu Leu Leu Arg Ser Phe Pro Asn Ile Asn Asn Ala Ser
1 5 10 15
Ser Leu Thr Thr Thr Gly Ala Asn Phe Ser Arg Thr Lys Ser Phe Ala
20 25 30
Asn Ile Tyr His Ala Ser Ser Tyr Val Pro Asn Ala Ser Trp His Asn
35 40 45
Arg Lys Ile Gln Lys Glu Tyr Asn Phe Leu Arg Phe Arg Trp Pro Ser
50 55 60
Leu Asn His His Tyr Lys Gly Ile Glu Gly Ala Cys Thr Cys Lys Lys
65 70 75 80
Cys Asn Ile Lys Phe Val Val Lys Ala Thr Ser Glu Lys Ser Leu Glu
85 90 95
Ser Glu Pro Gln Ala Phe Asp Pro Lys Ser Ile Leu Asp Ser Val Lys
100 105 110
Asn Ser Leu Asp Ala Phe Tyr Arg Phe Ser Arg Pro His Thr Val Ile
115 120 125
Gly Thr Ala Leu Ser Ile Ile Ser Val Ser Leu Leu Ala Val Glu Lys
130 135 140
Ile Ser Asp Ile Ser Pro Leu Phe Phe Thr Gly Val Leu Glu Ala Val
145 150 155 160
Val Ala Ala Leu Phe Met Asn Ile Tyr Ile Val Gly Leu Asn Gln Leu
165 170 175
Ser Asp Val Glu Ile Asp Lys Ile Asn Lys Pro Tyr Leu Pro Leu Ala
180 185 190
Ser Gly Glu Tyr Ser Phe Glu Thr Gly Val Thr Ile Val Ala Ser Phe
195 200 205
Ser Ile Leu Ser Phe Trp Leu Gly Trp Val Val Gly Ser Trp Pro Leu
210 215 220
Phe Trp Ala Leu Phe Val Ser Phe Val Leu Gly Thr Ala Tyr Ser Ile
225 230 235 240
Asn Val Pro Leu Leu Arg Trp Lys Arg Phe Ala Val Leu Ala Ala Met
245 250 255
Cys Ile Leu Ala Val Arg Ala Val Ile Val Gln Leu Ala Phe Phe Leu
260 265 270
His Met Gln Thr His Val Tyr Lys Arg Pro Pro Val Phe Ser Arg Pro
275 280 285
Leu Ile Phe Ala Thr Ala Phe Met Ser Phe Phe Ser Val Val Ile Ala
290 295 300
Leu Phe Lys Asp Ile Pro Asp Ile Glu Gly Asp Lys Val Phe Gly Ile
305 310 315 320
Gln Ser Phe Ser Val Arg Leu Gly Gln Lys Pro Val Phe Trp Thr Cys
325 330 335
Val Thr Leu Leu Glu Ile Ala Tyr Gly Val Ala Leu Leu Val Gly Ala
340 345 350
Ala Ser Pro Cys Leu Trp Ser Lys Ile Phe Thr Gly Leu Gly His Ala
355 360 365
Val Leu Ala Ser Ile Leu Trp Phe His Ala Lys Ser Val Asp Leu Lys
370 375 380
Ser Lys Ala Ser Ile Thr Ser Phe Tyr Met Phe Ile Trp Lys Leu Phe
385 390 395 400
Tyr Ala Glu Tyr Leu Leu Ile Pro Phe Val Arg
405 410

<210> 99 

<211> 964 

<212> DNA 

<213> RICE



30 



<400> 99 



^gcagcact gggtcttaca ttccaat gga gctcgcctgt tgctttcatt acatgcttcg 
^ tga ctttatt tgctttggtc afctgctataa ccca ^ tt 

ggaagtatca aatatcaact tt g gcgacaa agctcggtgt 
actctggttt att gat a g= a aattatgttg ctgctatfcgc 
40 agg Cttt cag gcgcactgta atggtgcctg tgcatgcfcgc 

tccagacatg ggttctggag caa g caaaat atactaagga tgctatttca cagtactacc 
^ ggttcatttg gaatctcttc tat g ct g aat acatcttctt CCC gt t g ata tagagaccaa 
gcaatctgat atggtctgca tgttgagtgc ggcaaaaact agaagcccat atgaacagtg 
SSagtagggg aacgaacatg ccatccatgg gaagactctg ataactctct ctcgcocggg 
50 ctgtaaaggg taagcactgt tggg catata tatgaaagga aggtgataaa gcagggafcgc 
taaattgcta ctgggatcot caaaggctta tagtggtcac cagtggaatg tgccttaata 
atttggttac ccagcagagc aagtttttgc aggttattag gtaatatctt tgagggaatg 



60 
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240 
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360 
420 
480 
540 
600 
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720 






aacttagatt tcattgtttt aaggtctggt cacacaacgg gtagtagtgc tggagcggca 780
aaaaacgacc ttgttttaca ctaccaaggg aggttaactc tagttttcat gtgaccactt 840
accttgagag ttgagaccat ggaatcactt gtcgactcct cggcttgtat atttctagtg 900
tcagcatttg cattctcctc cccacttgta cttgaaaagt tgaagacaac ttttttgttt 960
gtgt 964

<210> 100
<211> 421
<212> DNA
<213> WHEAT

<400> 100
cgtccgcgga cgcgtgggtg cttattcagt caatctgccg cactttctat ggaagagatc 60
tgctgttgtt gcagcactct gcatattagc agtgcgtgcg gtgatagttc aactggcatt 120
ttttctccac attcagacat ttgttttcag aaggccggca gacttttcaa agccattgat 180
atttgcaact gccttcatga cattcttctc agttgtaata gcattattca aggatatacc 240
cgatattgaa ggggaccgca tctttggaat ccaatctttt agtggtagac taggtcaaag 300
caggggtttc tggacttgcg ttggcctact tgaggttgcc tacggtgttg cgatactgag 360
srggggtaact tcttccagtt tgtggagcaa atctataact gttgtgggcc atgcaatcct 420
c 421

<210> 101
<211> 705
<212> DNA
<213> LEEK

<400> 101

gtttcccccc ctcgaatttt tttttttttt ttttacttca tttttctgtg aataaattct 60
taaaaaagac aaagaaaacc actggatatc ctaaattcaa cataggctat tgtcattcaa 120
tgataatctt taacacaaca tacaacatga atataattaa ggagaaatga tctgcaattg 180
ttgaaagaac tctccgtttt taagatgaca attaaagcgt tgttaattcc agccatttct 240
gcctccatta tctactcatc ttctcttgcg attcttttcc atgtaggtca taaaccctca 300
tcttacaaaa ggaatgagca agtactcagc atagaagagc ttccacacga acatataaaa 360
agatgtaata gtggttttgg tcattggtcc atgagatcta gcacgattcc aaagtaacga 420
cccaagaatt gcatgaccta tcactgttaa gcatttgctc cataggcatg aggaagtagc 480
tccaacaacc atgacaacag tgtaggccat ctcaaggaga tatatacata tccaaaacac 540
cctctcctgg ccaaggcgca cgctgaaaga atggatgcca aatattttgt ctccgtctat 600
atcaggtata tccttaaata gagcaataac aactgagaag aagctcatga aggcagttgc 660
aaatatcaat ggccttgtga aacttgctgg tcttttgaaa acaaa 705



<210> 102
<211> 637
<212> DNA
<213> LEEK

<220>
<221> misc_feature
<222> (1)..(637)
<223> n = g, a, t or c

<400> 102
nattcggcac gagttttgaa gaagttaagc atggactccc tccttaccaa gccagttgta 60
atacctctgc cttctccagt ttgttcacta ccaatcttgc gaggcagttc tgcaccaggg 120
cagtattcat gtagaaacta caatccaata agaattcaaa ggtgcctcgt aaattatgaa 180
catgtgaaac caaggtttac aacatgtagt aggtctcaaa aacttggtca tgtaaaagcc 240
acatccgagc attctttaga atctggatcc gaaggataca ctcctagaag catatgggaa 300
gccgtactag cttcactgaa tgttctatac aaattttcac gacctcacac aataatagga 360
acagcaatgg gcataatgtc agtttctttg cttgttgtcg agagcctatc cgatatttct 420
cctctgtttt ttgtgggatt attagaggct gtggttgctg cattgtttat gaatgtttac 480
attgtaggtc tgaatcaatt atttgacata gaaatagaca aggtcaataa acctgatctt 540
cctcttgcat ctggagaata ctcaccaaga gctggtactg ctattgtcat tgcttcagcc 600
atcatgagct ttggcattgg atggttagtt ggctctt 637






<210> 103
<211> 677
<212> DNA
<213> CANOLA

<400> 103
tttttttttt tttttttcaa aaagaccaat cctttagtat gtacatgaac aaagtgattt 60
tgtctccaag ctacaaagaa gaagaagaga ggtatacaaa gaaaactaca aatgttcacc 120
atgaatgcta gaagaagggg aataacagat actctgcgta gaagagattc catataaacc 180
ggtaatatcc tgctatagct tcctttgtgt agtttgcttt ttctagcacc catgtctgga 240
aaaccaagca tgaagccaag atcatatgtg caggaatcat caagctacct ctaaaaacct 300
gaggcatgta gaaagctagt gatatggcag aaatatagtt cactagcaga agtccagaac 360
cgaggaatgc aatgttcctc actccaagct ttgttgctag tgttgatatt tggaacttgc 420
gatctccttc aacatcagga agatcttttg taatagcaat gactagtgca aacagtgtca 480
caaaagacgb gatgaaagcc acaggtgcac tccactgaaa cgaaagtcca agagcagctc 540
tagtagcatg gtacacacca aaattaagaa gaaaacctcg taccgtggca ataataagaa 600
acgctgcaac tggaaatctc ttcattctaa atggtggaac agaatagatg gtccccagat 660
cggacgcgtg ggtcgac 677

<210> 104
<211> 1431
<212> DNA
<213> CORN

<400> 104
ccacgcgtcc gcccggccaa gggatggacg cgcttcgcct acggccgtcc ctcctccccg 60
tgcggcccgg cgcggcccgc ccgcgagatc attttctacc accatgttgt tccatacaac 120
gaaatggtga aggacgaatt tgcttttcta gccaaaggac ccaaggtcct accttgcatc 180
accatcagaa attcttcgaa tggaaatcct cctattgtag gatatcacat cggtcattaa 240
atacttctgt taatgcttcg gggcaacagc tgcagtctga acctgaaaca catgattcta 300
caaccatctg gagggcaata tcatcttctc tagatgcatt ttacagattt tcccggccac 360
atactgtcat aggaacagca ttaagcatag tctcagtttc ccttctagct gtccagagct 420
tgtctgatat atcacctttg ttcctcactg gtttgctgga ggcagtggta gctgcccttt 480
tcatgaatat ctatattgtt ggactgaacc agttattcga cattgagata gacaaggtta 540
acaagccaac tcttccattg gcatctgggg aatacaccct tgcaactggg gttgcaatag 600
tttcggtctt tgccgctatg agctttggcc ttggatgggc tgttggatca caacctctgt 660
tttgggctct tttcataagc tttgttcttg ggactgcata ttcaatcaat ctgccgtacc 720
ttcgatggaa gagatttgct gttgttgcag cactgtgcat attagcagtt cgtgcagtga 780
ttgttcagct ggcctttttt ctccacattc agacttttgt tttcaggaga ccggcagtgt 840
tttctaggcc attattattt gcaactggat ttatgacgtt cttctctgtt gtaatagcac 900
tattcaagga tatacctgac atcgaagggg accgcatatt cgggatccga tccttcagcg 960
tccggttagg gcaaaagaag gtcttttgga tctgcgttgg cttgcttgag atggcctaca 1020
gcgttgcgat actgatggga gctacctctt cctgtttgtg gagcaaaaca gcaaccatcg 1080
ctggccattc catacttgcc gcgatcctat ggagctgcgc gcgatcggtg gacttgacga 1140
gcaaagccgc aataacgtcc ttctacatgt tcatctggaa gctgttctac gcggagtacc 1200
tgctcatccc tctggtgcgg tgagcgcgag gcgaggtggt ggcagacgga tcggcgtcgg 1260
cggggcggca aacaactcca cgggagaact tgagtgccgg aagtaaactc ccgtttgaaa 1320
gttgaagcgt gcaccaccgg caccgggcag agagagacac ggtggctgga tggatacgga 1380
tggccccccc aataaattcc cccgtgcatg gtaaaaaaaa aaaaaaaaaa a 1431

<210> 105
<211> 1870
<212> DNA
<213> CORN

<400> 105



gccgcgcagc gcgacgagcg ccacctgctt gctgccgcgt gcctgcgtgc gtgtgcgtcc 60
accactgacc ccgcgcccgc ccgccgcccc tgcccctcca ctccacttgc tcactcgtcg 120
cggcccgctt cccccccggc caagggatgg acgcgcttcg cctacggccg tccctcctcc 180
ccgtgcggcc cggcgcggcc cgcccgcgag atcattttct accaccatgt tgttccatac 240
aacgaaatgg tgaaggacga atttgctttt ctagccaaag gacccaaggt cctaccttgc 300
atcaccatca gaaattcttc gaatggaaat cctcctattg taggatatca catcggtcat 360
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420 
480 
540 
600 
660 
720 
780 
840 
900 
960 



taaataottc tgttaatgct tcggggcaac 

s ctacaaccat ct ggagggc a atatcatctt ctcta ga t g c attttaca.a ttttccc ggc 
cacatac tgt cata gg aaca gC attaa g ca tagtctcagt ttcccfctcta 
,ct tgt c tg a tataecacct ttgttC ctca Ctggtttgct 
10 tt t tcat g aa tatctatatt .tt^a acc agtt att c g acatt gag ata gaC aa gg 

ttaacaa gcC aactcttcca tt gg ca tc t g gggaataoac cottgcaact 
^ t ag tttc ggt ctt tgccg c t a tgag ctt tg gC ctt ggatg tcacaacctc 
tgttttgggo tcttttcata a gct tt g ttc ttggg act g c atafctcaatc aatct g =c g t 
acc ttcg a tg g aa g a ga ttt gctgttgttg ca gca ct g t g catatt agca gttegtge^ 
20 tg at tgttca gctggccttt tttcfcccaca fctcagacttt tgtfcttcagg ^ 

tsrttttct.. .ccattatta tttgC aac tg g a t tta tgac gttcttctct ^ 
2s cactattcaa gg atatacc t gacatcgaag gggaccgcat ^ 
,c g tcc ggtt agggcaaaag aaggtctttfc ggafcctgcgt t ^ ttgctt ^ 

aca gcg tt g c g ataot ga t g ggagctacct cttcctgttt crtacraa^^ 

3 cc a c 9Fa g caaa aca g caacca 1200 

30 tc g ct ggc ca ttccatactt gccacaatcp t- al -,T„=~ 

gccgcgatcc tatggagctg c gcg c g atc g gtggacttga 1260 

Coctt «. c . tgttoaCoEg ^ tacg=ggagt i32o 

3s a« tg =te, t ccc t <, tggtg cg9lg , gcgo ^ 
c*rc gggg „ g gca ^, oaact c „ =gg9aga >cttg>gt3c oMM9tiM ^ 

40 gg at ggc ccc cccaataaat tcoeccgtgc a taot s ,^n 

cgtjjc atggtacccc ac g ot g ott g at g atatccc i 56 o 

-<*«». ^ c tg a tcgtct ot wt tggttgolc . i62o 

45 O .= tgot , gt , tgataoKC ttTCtagtec ttwo 1S80 

ca 3 t g ,ooc aaottggtog gctgag=tca gcg=tcigca ^ 
tt OT c ttgtg ogcCagcatg aatgacgtat . ^ ^ 

=o t tegtc , gtc tggg « gtgt tttgtgtccg wtcg tot3tcasag ^ 

cctcgctgct 

1870 

<210> 106
<211> 642
<212> DNA
<213> CORN

<400> 106
cggccggact cttctgactt ggcaaccgcc gcgcagcgcg acgagcgcca cctgcttgct 60
gccgcgtgcc tgcgtgcgtg tgcgtccacc actgaccccg cgcccgcccg ccgcccctgc 120
ccctccactc cacttgctca ctcgtcggct cgtcgcggcc cgcttccccc ccggccaagg 180
gatggacgcg cttcgcctac ggccgtccct cctccccgtg cggcccggcg cggcccgccc 240
gcgaggcagt ggtagctgcc cttttcatga atatctatat tgttggactg aaccagttat 300
tcgacattga gatagacaag gttaacaagc caactcttcc attggcatct ggggaataca 360
cccttgcaac tggggttgca atagtttcgg tctttgccgc tatgagcttt ggccttggat 420
gggctgttgg atcacaacct ctgttttggg ctcttttcat aagctttgtt cttgggactg 480
catattcaat caatctgccg taccttcgat ggaagagatt tgctgttgtt gcagcactgt 540
gcatattagc agttcgtgca gtgattgttc agctggcctt ttttctccac attcagactt 600
ttgttttcag gagaccggca gtgttttcta ggccattatt at 642



<210> 107
<211> 362
<212> DNA
<213> COTTON

<400> 107
cccacgcgtc cgaacattgt ttgcacttgt tattgccata accaaggatc ttccagatgt 60
agaaggagat cgcaaatttc aaatatcaac attagcaaca aagcttggag ttagaaatat 120
tgcatttctt ggttccggac ttctactggt gaattatgtt gctgctgtgt tggctgcaat 180
atacatgcct caggctttca ggcgtagttt aatgatacct gctcatatct ttttggcggt 240
ctgcttgatt tttcagacat gggtgttgga acaagcaaat tacaaaaagg aagcaatctc 300
ggggttctat cgtttcatat ggaatctctt ctatgcagag tatgcgattt tccccttcgt 360
362



<210> 108
<211> 575
<212> DNA
<213> TOMATO

<400> 108
cagatcaatt ccagttcctg ctgagttttc tccactcaaa accagttcac atgcaatagt 60
acgggttttg aaatgtaaag catggaagag accaaaaaag cactattcct cttcaatgaa 120
gttgcagcgg cagtatatca cgcaagagca tgttggagga agtgatctaa gcactattgc 180
tgctgataaa aaacttaaag ggagattttt ggtgcacgca tcatctgaac accctcttga 240
atctcaacct tctaaaagtc cttgggactc agttaatgat gccgtagatg ctttctacag 300
gttctcgcgg ccccatacca taataggaac agcattgagc ataatttcag tttctctcct 360
tgcagttgag aagttctctg atttttctcc attatttttc actggggtgt tagaggccat 420
tgttgctgcc ctattcatga acatttacat agttggttta aaccagttgt ctgacatcga 480
aatagacaag gtaaacaagc catatcttcc attggcatca ggggaatact ctgtacaaac 540
tggagtgatt gttgtgtcgt cttttgccat tttga 575

<210> 109
<211> 1663
<212> DNA
<213> ARABIDOPSIS

<400> 109
aacaccaaac acacaatttc acattctttt gcatatttct tcttcttctt ccattatgga 60
gatacggagc ttgattgttt ctatgaaccc taatttatct tcctttgagc tctctcgccc 120
tgtatctcct ctcactcgct cactagttcc gttccgatcg actaaactag ttccccgctc 180
catttctagg gggatcccgt cgatctccac cccgaatagt gaaactgaca agatctccgt 240
taaacctgtt tacgtcccga cgtctcccaa tcgcgaactc cggactcctc acagtggata 300
ccatttcgat ggaacacctc ggaagttctt cgagggatgg tggatccggg tttccatccc 360
agagaagagg gagagttttt gttttatgta ttctgtggag aatcctgcat ttcggcagag 420
tttgtcacca ttggaagtgg ctctatatgg acctagattc actggtgttg gagctcagat 480
tcttggcgct aatgataaat atttatgcca atacgaacaa gactctcaca atttctgggg 540
agatcgacat gagctagttt tggggaatac ttttagtgct gtgccaggcg caaaggctcc 600
aaacaaggag gttccaccag aggaatttaa cagaagagtg tccgaagggt tccaagctac 660
tccattttgg catcaaggtc acatttgcga tgatggccgt actgactatg cggaaactgt 720
gaaatctgct cgttgggagt atagtactcg tcccgtttac ggttggggtg atgttggggc 780
caaacagaag tcaactgcag gctggcctgc agcttttcct gtatttgagc ctcattggca 840
gatatgcatg gcaggaggcc tttccacagg gtggatagaa tggggcggtg aaaggtttga 900
gtttcgggat gcaccttctt attcagagaa gaattggggt ggaggcttcc caagaaaatg 960
gttttgggtc cagtgtaatg tctttgaagg ggcaactgga gaagttgctt taaccgcagg 1020
tggcgggttg aggcaattgc ctggattgac tgagacctat gaaaatgctg cactggtttg 1080
tgtacactat gatggaaaaa tgtacgagtt tgttccttgg aatggtgttg ttagatggga 1140
aatgtctccc tggggttatt ggtatataac tgcagagaac gaaaaccatg tggtggaact 1200
agaggcaaga acaaatgaag cgggtacacc tctgcgtgct cctaccacag aagttgggct 1260
agctacggct tgcagagata gttgttacgg tgaattgaag ttgcagatat gggaacggct 1320
atatgatgga agtaaaggca aggtgatatt agagacaaag agctcaatgg cagcagtgga 1380
gataggagga ggaccgtggt ttgggacatg gaaaggagat acgagcaaca cgcccgagct 1440
actaaaacag gctcttcagg tcccattgga tcttgaaagc gccttaggtt tggtcccttt 1500
cttcaagcca ccgggtctgt aacattgatg agtgttttgt ttgttgatag agacccatgt 1560
gatgaatgaa gccttagtca tgtcattgct agcttcacta ttatgtatgt atgattttag 1620
ttcgttcggt ccttgtggta aatgatacgg gccagtgtaa agt 1663



35 



<210> 110
<211> 488
<212> PRT
<213> ARABIDOPSIS

<400> 110
Met Glu Ile Arg Ser Leu Ile Val Ser Met Asn Pro Asn Leu Ser Ser
1 5 10 15
Phe Glu Leu Ser Arg Pro Val Ser Pro Leu Thr Arg Ser Leu Val Pro
20 25 30
Phe Arg Ser Thr Lys Leu Val Pro Arg Ser Ile Ser Arg Val Ser Ala
35 40 45
Ser Ile Ser Thr Pro Asn Ser Glu Thr Asp Lys Ile Ser Val Lys Pro
50 55 60
Val Tyr Val Pro Thr Ser Pro Asn Arg Glu Leu Arg Thr Pro His Ser
65 70 75 80
Gly Tyr His Phe Asp Gly Thr Pro Arg Lys Phe Phe Glu Gly Trp Tyr
85 90 95
Phe Arg Val Ser Ile Pro Glu Lys Arg Glu Ser Phe Cys Phe Met Tyr
100 105 110
Ser Val Glu Asn Pro Ala Phe Arg Gln Ser Leu Ser Pro Leu Glu Val
115 120 125
Ala Leu Tyr Gly Pro Arg Phe Thr Gly Val Gly Ala Gln Ile Leu Gly
130 135 140
Ala Asn Asp Lys Tyr Leu Cys Gln Tyr Glu Gln Asp Ser His Asn Phe
145 150 155 160
Trp Gly Asp Arg His Glu Leu Val Leu Gly Asn Thr Phe Ser Ala Val
165 170 175
Pro Gly Ala Lys Ala Pro Asn Lys Glu Val Pro Pro Glu Glu Phe Asn
180 185 190
Arg Arg Val Ser Glu Gly Phe Gln Ala Thr Pro Phe Trp His Gln Gly
195 200 205
His Ile Cys Asp Asp Gly Arg Thr Asp Tyr Ala Glu Thr Val Lys Ser
210 215 220
Ala Arg Trp Glu Tyr Ser Thr Arg Pro Val Tyr Gly Trp Gly Asp Val
225 230 235 240
Gly Ala Lys Gln Lys Ser Thr Ala Gly Trp Pro Ala Ala Phe Pro Val
245 250 255
Phe Glu Pro His Trp Gln Ile Cys Met Ala Gly Gly Leu Ser Thr Gly
260 265 270
Trp Ile Glu Trp Gly Gly Glu Arg Phe Glu Phe Arg Asp Ala Pro Ser
275 280 285
Tyr Ser Glu Lys Asn Trp Gly Gly Gly Phe Pro Arg Lys Trp Phe Trp
290 295 300
Val Gln Cys Asn Val Phe Glu Gly Ala Thr Gly Glu Val Ala Leu Thr
305 310 315 320
Ala Gly Gly Gly Leu Arg Gln Leu Pro Gly Leu Thr Glu Thr Tyr Glu
325 330 335
Asn Ala Ala Leu Val Cys Val His Tyr Asp Gly Lys Met Tyr Glu Phe
340 345 350
Val Pro Trp Asn Gly Val Val Arg Trp Glu Met Ser Pro Trp Gly Tyr
355 360 365
Trp Tyr Ile Thr Ala Glu Asn Glu Asn His Val Val Glu Leu Glu Ala
370 375 380
Arg Thr Asn Glu Ala Gly Thr Pro Leu Arg Ala Pro Thr Thr Glu Val
385 390 395 400
Gly Leu Ala Thr Ala Cys Arg Asp Ser Cys Tyr Gly Glu Leu Lys Leu
405 410 415
Gln Ile Trp Glu Arg Leu Tyr Asp Gly Ser Lys Gly Lys Val Ile Leu
420 425 430
Glu Thr Lys Ser Ser Met Ala Ala Val Glu Ile Gly Gly Gly Pro Trp
435 440 445
Phe Gly Thr Trp Lys Gly Asp Thr Ser Asn Thr Pro Glu Leu Leu Lys
450 455 460
Gln Ala Leu Gln Val Pro Leu Asp Leu Glu Ser Ala Leu Gly Leu Val
465 470 475 480
Pro Phe Phe Lys Pro Pro Gly Leu
485



